Personal accounts in letters in comparison to factual information in diary entries

For our final project, Mary Medure and I collaborated to compare and contrast James Merrill Linn’s diary entries with his letters to his mother and to his brother, John. We wanted to focus on the content of his diary entries and letters rather than on tools that documented his locations. Thus, instead of mapping, we each transcribed a different letter that would eventually be tagged in TEI and converted to a Digital Edition. We chose letters written around the same time so that we could compare their content, and we also wanted them to fall in the same time frame as the diary entries we had transcribed earlier this semester. Mary transcribed the letter to John of February 11, 1862; the diary entries she had already transcribed were from February 8-12, 1862. I transcribed the letter to Linn’s mother of February 19, 1862, and the diary entries from February 5-7, 1862. We used Voyant Tools to compare the words he used most often in his diary entries and letters.

Transcription Difficulty of Letter to Mother on February 19, 1862

During the transcription process, Mary and I separately transcribed the two pages of each letter and then worked together to clarify the words we could not decipher. We would read the letters aloud to each other to make more sense of Linn’s experiences. However, some words were illegible, so we went to the archives in the library to read the letters firsthand. In Linn’s letter to his mother, Mary and I could not read the words at the end of each page because of the binding of the documents. In the archives, we could not bend or fold the pages over to read the full words, so we had to make some educated guesses based on the context of each sentence. In her article, Pierazzo raises a great point: “[j]udgment is necessarily involved in deciding what is in fact present [in the manuscript], as when an ambiguously formed character resembles two different letters; but the transcriber’s goal is to make an informed decision about what is actually inscribed at each point (Meulen and Tanselle, 1999, p. 201)” (465). This matches our experience: although Mary and I went to the archives for a second look at the documents, we still needed to make educated contextual guesses about multiple words for the document to make sense. For example, the screenshot shows the word “tomatoes” cut off. In this section of the letter, he was talking about food, and “toma-” is legible. Therefore, I needed to use the context of the sentence to make an educated guess about the word that was cut off at the end of the page.

Color Coding of Events and Affiliation in Letter to Mother February 19, 1862

After the transcription process, we needed to start tagging the words that we felt were most important to include. To make the tagging process simpler, we color coded by person/people, place, affiliation, object, state, trait, event, date, time, and military role. In our diary entries, we did not color code to the same extent. We found that affiliation and person/people were each important enough to be a separate category. For instance, we consider “Americans” to be an affiliation because it names a group of people associated with a specific location. We also categorized “war” and “battles” as events rather than places because they take place at different locations. I did not have “event” as a category in the diary entry I transcribed because there he refers to the battles by their actual names. When he writes to his mother, I believe he refers to the battles generally because he is not using the letters as a record of his specific locations and events.

Color Coding of Descriptions and States in Letter to Mother February 19, 1862

After color coding, we noticed that the majority of the words we highlighted were descriptions and states of well-being. Descriptions are highlighted in turquoise, and states, including weather and emotions, are highlighted in gray. He is writing to his mother more about his personal experiences and his emotional responses to the war overall. After color coding the letters, we tagged the highlighted words and transferred the document to Oxygen to make a Digital Edition.

Letters to Mother and John most commonly used words

Voyant is a great tool for comparing the content of different documents, so Mary and I decided to use it to compare the diary entries to the letters. First, we loaded our transcription files of Linn’s letters to his mother and to his brother, John, to show the most commonly used words. I noticed that he frequently used “hope,” “remember,” “little,” and “home.” These words express and describe how he feels and how he reacts to his surroundings, as opposed to naming specific locations and people. He refers to “home” (Lewisburg) frequently, which makes sense because he is writing to his mother. Generic terms like “men” and “company” are common because his letter to his mother is more a representation of his personal experiences than a collection of locations he travels to or people he encounters.
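Voyant does this counting for us, but the basic idea can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the stop-word list is a short one made up for illustration (Voyant's own list and tokenizer are not reproduced here), the function name top_words is hypothetical, and the two sample sentences are invented rather than quoted from Linn.

```python
import re
from collections import Counter

# Short illustrative stop-word list (an assumption; Voyant's list is longer).
STOP_WORDS = {
    "the", "a", "an", "and", "of", "to", "in", "i", "we", "my",
    "is", "was", "it", "that", "for", "on", "at", "with",
}

def top_words(text, n=5):
    """Return the n most frequent non-stop words in text, lowercased."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)

# Invented sample sentences, NOT quotations from Linn's letters or diary.
letter_sample = "I hope you are well at home. I hope to be home soon."
diary_sample = "The battle began at dawn. After the battle the company marched."

print("letter:", top_words(letter_sample, 2))
print("diary: ", top_words(diary_sample, 2))
```

Running the same function on the full letter and diary transcriptions would give two ranked lists that can be compared side by side, which is roughly the comparison we read off the Voyant displays.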

Linn’s diary entries most commonly used words

After analyzing our transcriptions of Linn’s letters to his mother and brother, Mary and I combined our diary entries to see the most commonly used words. We noticed that military men of different ranks were prevalent throughout his diary entries. Linn refers to specific people such as General Burnside, Captain Bennet, and many more. Comparatively speaking, “battle” appears in both the letters and the diary entries; however, it is displayed significantly larger for the diaries, indicating that he used it more often there. This supports the hypothesis that Linn’s diary entries are more of a personal account of places and people, whereas his letters to his family are more about his emotional experiences throughout the war.

Transcribing Linn’s letters to his mother and John from around the same time as the diary entries we had previously transcribed gave Mary and me the support to claim that Linn’s diary entries are a personal record, kept for himself, of the locations he traveled to and the people he encountered along the way. In contrast, Linn’s letters to his mother and John are more generic and express his feelings about the war rather than a series of places and people. Color coding helped us significantly: it confirmed our hypothesis that Linn’s writing to his mother and brother was more emotional and personal, whereas his diary entries were a collection of people and places for himself to remember later. Voyant is a great tool for visualizing the contrast between the diary entries and the letters to family, because it gives the viewer a general idea of the premise and themes of each document. Overall, this project gave me a much better understanding of James Merrill Linn’s purpose in writing what he did in both his diary entries and his letters home.

Here are the links to my final TEI product!

Digital edition:

Works Cited
Linn, James Merrill. Diary. February 5-7, 8-12, 1862. MS. Bucknell University Archives and Special Collections, Lewisburg, PA.
Linn, James Merrill. Letter to John. February 11, 1862. MS. Bucknell University Archives and Special Collections, Lewisburg, PA.
Linn, James Merrill. Letter to Mother. February 19, 1862. MS. Bucknell University Archives and Special Collections, Lewisburg, PA.
Pierazzo, Elena. “A Rationale of Digital Documentary Editions.” Literary and Linguistic Computing 26.4 (2011): 463-477.



Analyzing Transcription with Tagging

Using close reading as a tool to analyze the transcription helped us to better understand the text. In class, we have used two techniques: categorizing words by color and encoding in TEI. Both were very useful, especially TEI, for categorizing important words. By tagging words, we analyzed every bit of information they might offer. Pierazzo states that “no transcription, however accurate, will ever be able to represent entirely the source document” (464). Although we can’t represent the source entirely, we can at least capture every bit of information we can.

Categorizing words by colors was a very interesting technique. It is simple yet efficient in highlighting significant words. We tagged words by category (people, places, events, traits, states, etc.) and highlighted them in different colors. As simple as it sounds, we encountered a lot of problems. We had to define what is and what isn’t tag-worthy. The categories were a problem in themselves, and we had many arguments about what should go in which category. For example, we had to decide whether “Cossack” should be a place or an object. Like “Cossack,” many words were on the verge of two different categories. Overall, it was interesting to see how everyone chose to tag and how Linn chose to write down his observations. There were more tags for people and objects than for anything else. Linn seems to be more concerned with physical things.

TEI changes the way we can analyze text. Like the colorization technique, TEI allows us to categorize words, but with a far wider variety of options for tagging significant words. In Pierazzo’s article, Driscoll is quoted as saying that “to all intents and purposes there is no limit to the information one can add to a text—apart, that is, from the limits of the imagination” (466) when commenting on the possibilities of TEI. While encoding with TEI, I had a lot of trouble deciding how many different codes I needed to analyze a word. We had a lot of options, but we also had a lot of words. With TEI, I found myself tagging more words than with the colorization, including a lot of words that were not significant. However, by tagging them, I was able to learn everything I could, from the physical state of an object to the time and place.

The collaborative process worked to our advantage. Because we were able to work with each other, we made sure that we had the same guidelines for tagging these words. Pierazzo says that the opinion of the editor changes the interpretation of the transcription. By agreeing on how to tag certain words, we can arrive at a similar interpretation of the text, which prevents us from deviating from the accepted guidelines.

What I Learned From Tagging

Learning how to mark up our documents and then applying what we learned to our journal entries has given me a deeper understanding of the way that Linn writes about the war. Although the process was tricky and frustrating at points, purely because of my lack of experience, I believe that it brought focus to the specific types of things that Linn talks about when he is writing. For example, when going through the version of the Google Doc that was marked up with colors, it was clear that some colors were used more than others. In mine, blue and orange were the two most used, while purple, brown, and cyan were the least used. This gives us insight into Linn’s writing style, with its focus on people and objects. Although he is descriptive in some places, he sometimes jumps from topic to topic, which is why we see less cyan, brown, and purple.

A lot of what Pierazzo talks about in her piece was visible in our process. For example, there was a large variety in the amount of tagging that occurred, with some people tagging most words while others picked out just the important ones. This echoes Pierazzo’s article when she says, “So, we must have limits, and limits represent the boundaries within which the hermeneutic process can develop” (466). One of my paragraphs is below (A), and I chose to tag only the words that I thought were important and relevant.

Although I do not think I made a mistake in being sparse, other people heavily marked up their entries (B), which made me think about why they considered those words important compared to how I chose to select mine. Again, this links back to Pierazzo when she says that a digital edition includes words and sections that are “considered meaningful to the editors” (475) and “that one cannot declare once and for all which features should be included” (475). The degree to which each person marked up their piece was one of the most interesting factors when I looked over everyone else’s entries.

I also learned a lot in the editorial process, primarily that it is harder than I thought to come to conclusions on basic questions like whether a boat is a place or an object. When we were talking about the Cossack and different ways to go deeper in tagging, it changed the way that I thought about this tagging and my reading. When tagging my entry, I had a deep internal struggle about how to tag “battery,” considering that, like the Cossack, it could be both. My struggle was then externalized when we came to class and discussed the Cossack. When talking about what to mark up and what not to mark up, Pierazzo says that it “depends either on the particular vision that we have of a particular manuscript or on practical constraints” (465). For me, the idea of a particular vision is why we disagreed. I saw the battery as a place, and when asked about the Cossack it made sense to me that it would be a place too. Boats represent places for me, but someone made the point that to Linn they are objects, not places, and that makes sense to me. By making it an object but adding the type “boat,” we were able to come to a consensus. However, considering that we spent so much time arguing over one word, I dread to think what it must be like to go through an entire edited text. I thought that this was interesting; it gave me a better look into the kinds of words and descriptions that Linn uses and taught me some useful new skills.

Close Reading

The process of marking up my transcription has affected my understanding of the text. It helped me to understand the context and learn specific words. For example, I had no idea that a battery was a place; rather, I thought a battery was an object. By looking at it closely, I came to realize that the battery was a significant place during the Civil War. In my specific entry, I found color coding and marking up words very interesting because I could see whether or not people agreed with how I coded them. For example, I coded Rengler’s Old Mill as a place, whereas someone else might have thought of it as an object.

In her article, Elena Pierazzo speaks about limits. She says, “So, we must have limits, and limits represent the boundaries within which the hermeneutic process can develop” (466). She means that we cannot mark up everything, because then we would not be setting any limits at all. “The challenge is therefore to select those limits that allow a model which is adequate to the scholarly purpose for which it has been created” (466). I faced this problem when I was choosing which words to tag. I had to limit myself with the tagging; otherwise I would’ve gone overboard and tagged the whole paper. It was hard to choose which words I wanted to tag because they all seemed taggable. After I got the hang of it, it became easier, and limiting the words that I tagged became more natural and less of a process.

In Pierazzo’s article, G. T. Tanselle says: “The process of selection is inevitably an interpretative act: what we choose to represent and what we do not depends either on the particular vision that we have of a particular manuscript or on practical constraints” (467). I related to this when I was trying to decide whether something was an object or a place. For example, our whole class argued over whether the Cossack was a place or an object. Some people envision boats as objects, whereas others envision them as places. My feeling was that the Cossack was a place because it’s somewhere that people go. I also had to interpret while selecting when I had to decide whether something was just a person’s name or a role name. For example, I interpreted “col” as a role name, so I selected the tag “roleName” as opposed to “persName.”
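To make the persName/roleName choice concrete, here is a rough sketch of how such tags could be counted once a passage is encoded. It rests on several assumptions: the sentence is invented (only the names General Burnside, Captain Bennet, and Rengler's Old Mill come from our transcriptions), nesting roleName inside persName is one possible encoding and not necessarily the one our class settled on, and the TEI namespace is left out to keep the example short.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sentence for illustration; a real TEI file would also declare
# the TEI namespace, which is omitted here for brevity.
fragment = """
<p><persName><roleName>General</roleName> Burnside</persName> met
<persName><roleName>Captain</roleName> Bennet</persName> near
<placeName>Rengler's Old Mill</placeName>.</p>
"""

root = ET.fromstring(fragment)

# Count every tag inside the paragraph (the <p> wrapper itself is skipped).
tag_counts = Counter(el.tag for el in root.iter() if el is not root)

# Collect the text of each role name, in document order.
roles = [el.text for el in root.iter("roleName")]

print(tag_counts)
print(roles)
```

A count like this is one way to check an impression such as "there were more tags for people than for places" against the encoded file itself.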

Another point in Pierazzo’s article comes when she says, “Capital letters were preserved and marked; Austen used these inconsistently for any part of speech, so we have distinguished nouns, verbs, pronouns, adjectives, articles, and adverbs” (470). I agree with this because if words weren’t capitalized, I’d have trouble distinguishing what the words were and where new sentences started. Also, without the proper punctuation, it would be hard to follow the entry and understand what was going on. “The original fluctuating punctuation was also kept” (470). If it weren’t kept, there would be no proper flow to the diary entry.

As a class, we came to an agreement on whether specific words were places or objects. At first everyone would bicker, but by the end we all agreed on how each word should be categorized. I found this process very engaging, yet frustrating, but overall I liked it!

Below is me tagging the word “wagon track” as an “object type” in Oxygen:


Tagging in Oxygen 

Below is me marking the word “wagon track” with the color orange to represent an object:

Color Coding

As the two pictures show, the words I color coded as objects were the same ones I ended up tagging as objects in Oxygen.

In conclusion, I enjoyed close reading. It really helped me understand the context of the diary entry more. At first I thought Oxygen was going to be extremely difficult to use and overwhelming, but it turned out to be very manageable and to my liking!

Oxygen Mark Up of Diary 60

During the markup process of my Linn diary transcription, I learned a lot about the context of Linn’s writings through close reading. It allowed me to focus on certain words that helped me get a greater understanding of the text as a whole. Collaborating with the editorial group in class also helped me get a better grip on how to mark up certain words. One instance that helped me decide how to mark certain words was the debate over whether a boat is a place or an object. In my opinion, a boat is a place, especially when it is named, like the “Cossack.” The “Cossack” seemed to have much more meaning and presence than a mere object. After a lengthy and intense discussion of why our class felt the way they did, we decided to mark up any boat, whether or not it has a proper name, as an object. We decided that this “object” would have a more descriptive mark up.

How do we mark up a boat?

Elena Pierazzo really characterizes what Jakacki calls the “richness of the marked up text as a form of intellectual engagement with its interpretation.” Her article, “A Rationale of Digital Documentary Editions,” is exemplary of how people should mark up transcriptions. One thing she wants people to consider in the marking up process is that we “must have limits, and limits represent the boundaries within which the hermeneutic process can develop” (Pierazzo). Basically, she believes that we cannot mark up and focus on every single word. That would be, one, very time-consuming and, two, counterproductive. In order to interpret Linn’s transcriptions, we had to make decisions about what was important to us. If everything was marked up, wouldn’t we just end up back at the beginning? We need to see the relationship between certain things, and this is, ultimately, intellectual engagement.

persName and placeName

Another point that Pierazzo brings up is that we ultimately choose what we want to represent. There needs to be a meaning behind the mark ups. She states that “the process of selection is inevitably an interpretative act: what we choose to represent and what we do not depends either on the particular vision that we have of a particular manuscript or on practical constraints” (Pierazzo). There are certain influences that make us mark up certain words. In my case, I focused on people and places. What I found was that after zeroing in on one particular area, I could then go deeper and mark up those words even further. Although I had some technical difficulties with how deeply I could describe different people’s roles, I at least tried to make that one of my main goals.

The last crucial point that Pierazzo argues is that letters are not just marks on paper. They are symbols we have chosen to give meaning to. Robinson insists, “an ‘i’ is not an ‘i’ because it is a stroke with a dot over it. An ‘i’ is an ‘i’ because we all agree that it is an ‘i’” (Pierazzo). Taking this into consideration, our class decided that we would always transcribe “&” wherever Linn writes “&.” After coming to this conclusion, we had to make it clear in Oxygen that we wanted “&” to also mean “and.” An ampersand is not just some weird symbol; we, along with English-speaking society, have agreed that an ampersand means “and.”
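TEI offers a choice element that pairs the original form (orig) with a regularized one (reg), which is one way to say that “&” also means “and.” Whether our class used exactly this encoding is an assumption made for illustration, the phrase “bread & coffee” is invented, and the helper name render is hypothetical. A minimal sketch of how one encoded passage can then be read either way:

```python
import xml.etree.ElementTree as ET

# Invented phrase; the ampersand is escaped as &amp; because it is XML.
fragment = "<p>bread <choice><orig>&amp;</orig><reg>and</reg></choice> coffee</p>"

def render(elem, use_regularized):
    """Flatten an element to text, picking <reg> or <orig> in each <choice>."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "choice":
            wanted = child.find("reg" if use_regularized else "orig")
            parts.append(wanted.text if wanted is not None and wanted.text else "")
        else:
            # Any other element is flattened recursively.
            parts.append(render(child, use_regularized))
        # Text that follows the child element belongs to the parent.
        parts.append(child.tail or "")
    return "".join(parts)

p = ET.fromstring(fragment)
print(render(p, use_regularized=False))  # what Linn wrote
print(render(p, use_regularized=True))   # the agreed reading
```

The point of encoding both forms is that the transcription stays faithful to the page while the reading text can still use the meaning everyone agrees on.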

After using Oxygen and reading the Pierazzo article, I have a much better understanding of Diary 60 of the Linn transcription. Close reading of individual words contributes to the overall meaning of Linn’s diary.