Personal accounts in letters in comparison to factual information in diary entries

For our final project, Mary Medure and I collaborated to compare and contrast James Merrill Linn’s diary entries with his letters to his mother and to his brother, John. We wanted to focus on the content of his diary entries and letters rather than on tools that documented his locations. Thus, instead of mapping, we each chose to transcribe a different letter that would eventually be tagged in TEI and converted to a Digital Edition. Mary and I chose letters written around the same time so that we could compare their content, and we also wanted both letters to fall in the same time frame as the diary entries we transcribed earlier this semester. Mary transcribed the letter to John of February 11, 1862; the diary entries she had already transcribed cover February 8-12, 1862. I transcribed the letter to Linn’s mother of February 19, 1862; my diary entries cover February 5-7, 1862. We then used Voyant Tools to compare the words he used most often in his diary entries and in his letters.

Transcription Difficulty of Letter to Mother on February 19, 1862

During the transcription process, Mary and I separately transcribed the two pages of each letter and then worked together to clarify the words we could not decipher. We read the letters aloud to each other to make more sense of Linn’s experiences. Some words were still illegible, so we went to the archives in the library to read the letters firsthand. In Linn’s letter to his mother, Mary and I could not read the words at the end of each page because of the binding of the documents. In the archives, we could not bend or fold the pages over to read the full words, so we had to make educated guesses based on the context of each sentence. In her article, Pierazzo raises a great point: “[j]udgment is necessarily involved in deciding what is in fact present [in the manuscript], as when an ambiguously formed character resembles two different letters; but the transcriber’s goal is to make an informed decision about what is actually inscribed at each point (Meulen and Tanselle, 1999, p. 201)” (465). Our experience bears this out: even after a second look at the documents in the archives, we still had to rely on context to settle several words. For example, the screenshot on the left shows the word “tomatoes” cut off. In this section of the letter Linn is talking about food, and “toma-” is legible, so I used the context of the sentence to infer the rest of the word that was cut off at the end of the page.

Color Coding of Events and Affiliation in Letter to Mother February 19, 1862

After the transcription process, we needed to start tagging the words that we felt were most important to include. To make the tagging process simpler, we color coded by person/people, place, affiliation, object, state, trait, event, date, time and military role. In our diary entries, we did not color code to the same extent. We found that affiliation and person/people were important enough to be separate categories. For instance, we consider “Americans” to be an affiliation because it is a group of people associated with a specific location. We also categorized “war” and “battles” as events rather than places because they occur at different locations. I did not have “event” as a category in the diary entry I transcribed because there Linn refers to battles by their actual names. When he writes to his mother, I believe he refers to the battles generally because he is not using the letters as a record of his specific locations and events.

Color Coding of Descriptions and States in Letter to Mother February 19, 1862

After color coding, we noticed that the majority of the words we highlighted were descriptions and states of well-being. Descriptions are highlighted in turquoise and states, including weather and emotions, are highlighted in gray. In writing to his mother, Linn dwells more on his personal experiences and his emotional responses to the war overall. After color coding the letters, we tagged the highlighted words and transferred the document to Oxygen to make a Digital Edition.

Most commonly used words in the letters to Mother and John

Voyant is a great tool for comparing the content of different documents, so Mary and I decided to use it to compare the diary entries to the letters. First, we loaded our transcription files of Linn’s letters to his mother and to his brother, John, to show the most commonly used words. I noticed that he frequently used “hope”, “remember”, “little”, and “home”. These words express how he feels and how he reacts to his surroundings rather than naming specific locations and people. He refers to “home” (Lewisburg) frequently, which makes sense because he is writing to his mother. Generic terms like “men” and “company” are common because his letter to his mother is more a representation of his personal experiences than a list of the locations he travels to or the people he encounters.

Most commonly used words in Linn’s diary entries

After analyzing our transcriptions of Linn’s letters to his mother and brother, Mary and I combined our diary entries to see the most commonly used words. We noticed that military men of different ranks were prevalent throughout his diary entries. Linn refers to specific people such as General Burnside, Captain Bennet, and many more. “Battle” appears in both the letters and the diary entries; however, it is noticeably larger in the diary word cloud, which indicates that he used it more often there. This supports the hypothesis that Linn’s diary entries are more of a personal record of places and people, whereas his letters to his family are more about his emotional experiences throughout the war.
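The comparison that Voyant draws as two word clouds can be roughly approximated by counting words in each text. The sketch below is only an illustration, not how Voyant works internally: the function name `top_words`, the tiny stop-word list, and both sample strings are invented placeholders (they are not Linn's words), and a real run would load the transcription files instead.

```python
import re
from collections import Counter

# A deliberately tiny, hypothetical stop-word list; Voyant ships its own.
STOP_WORDS = {"the", "and", "of", "to", "a", "in", "i", "we", "was",
              "is", "it", "at", "on", "my", "for", "with", "you", "be"}

def top_words(text, n=5, stop_words=STOP_WORDS):
    """Return the n most frequent words in text, ignoring stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(t for t in tokens if t not in stop_words)
    return counts.most_common(n)

# Invented sample sentences standing in for the real transcriptions.
letters = "I hope you remember the little garden at home. I hope to be home soon."
diary = "General Burnside ordered the battle line. The battle began at the island."

print("letters:", top_words(letters, 3))
print("diary:  ", top_words(diary, 3))
```

Running the same function on the letters and on the diary entries, with the same stop-word list, gives two ranked lists that can be compared side by side, much like the two word clouds.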

Transcribing letters that Linn wrote to his mother and John around the same time as the diary entries we had already transcribed gave Mary and me the support to claim that the diary entries are a collection, kept for himself, of the locations he traveled to and the people he encountered along the way. In contrast, his letters to his mother and John are more generic and express his feelings about the war rather than a series of places and people. Color coding helped us significantly: it confirmed our hypothesis that Linn’s writing to his mother and brother was more emotional and personal, whereas his diary entries were a record of people and places for him to remember later. Voyant is a great tool for visualizing this contrast because it gives the viewer a general idea of the premise and themes of each document. Overall, this project gave me a much better understanding of James Merrill Linn’s purpose in writing what he did in both his diary entries and his letters home.

Here are the links to my final TEI product!

Digital edition:

Works Cited
Linn, James Merrill. Diary. February 5-7, 8-12, 1862. MS. Bucknell University Archives and Special Collections, Lewisburg, PA.
Linn, James Merrill. Letter to John. February 11, 1862. MS. Bucknell University Archives and Special Collections, Lewisburg, PA.
Linn, James Merrill. Letter to Mother. February 19, 1862. MS. Bucknell University Archives and Special Collections, Lewisburg, PA.
Pierazzo, Elena. “A Rationale of Digital Documentary Editions.” Literary and Linguistic Computing 26.4 (2011): 463-477.



Linn’s Journey through the Croatan Sound depicted by GIS mapping

ArcGIS Online allows viewers to visualize and interactively map historical events. Working with GIS has given me a better understanding of James Merrill Linn’s locations, battles and overall journey throughout the Civil War. My diary entry covers February 5-7, 1862, when Linn focuses on his expedition towards Roanoke Island. He starts at Stumpy Point, anchors off the shore of the island in the Croatan Sound the next day, and then, aboard the Spaulding, travels around the bend of Roanoke Island, where a cannonade commences. This cannonade marks the start of the Battle of Roanoke Island.

Roanoke Island in relation to Tyrell Shore

Dale Hartman and I hypothesized that Linn misidentified his location when he wrote, “the gun boats has moved up into the Channel between Roanoke Island and the Tyrell shore.” As shown to the left, however, the Tyrell shore (shown as Tyrell county on the map) is not close to Roanoke Island, where the battle took place that same day. Although Dale and I could be mistaken, we believe that Linn made an error in recording his location. From the diary entry alone, we would have assumed that he did travel in the channel between the Tyrell shore and Roanoke Island. GIS, and maps in general, give us the tools and resources to track Linn’s journey throughout the Civil War so that we can better understand his locations day by day.

In his article, Bodenhamer discusses the significance and consequences of GIS mapping. He says that “making data visual spurred intuitive interpretation- recognition of patterns, for instance- that remained hidden in statistical analysis” (17-18). For our purposes, GIS helped us visualize aspects of Linn’s experience of the Civil War that we could not grasp just by reading his diary entries. As I mentioned before, Dale and I would not have been able to question Linn’s record of his location on February 7, 1862 without using GIS to visualize his path to Roanoke Island.

Bodenhamer also mentions the importance of layers in GIS. He writes, “[g]eographic information systems operate a series of layers, each representing a different theme and tied to a specific location on planet earth. These layers are transparent, although the user can make any layer or combination of layers opaque while leaving others visible” (27). Because maps change over time as scholars continuously make new discoveries, layers benefit the viewer by giving a better sense of historical background of the map at the time. Additionally, layers focus on very specific events in history. For example, the layers “RoanokeRebels” and “Roanoke1862” are directed towards Civil War studies of battles in 1862. The layers are user-friendly and make the map more relevant to the viewer’s field of study.

Bodenhamer also points out that GIS has some drawbacks. For instance, scholars are trying to address the issue of “how… we as humanists make GIS do what it was not intended to do, namely, represent the world as culture and not simply mapped locations” (23). In some fields of research, cultural and social differences are critical and should be represented in the maps. However, for tracking Linn’s locations and comparing them to what he wrote in his diary entries, GIS serves its purpose by showing locations and different layers.

GIS has given me an overall better understanding of Linn’s journey throughout the Civil War. GIS and the Map app have essentially brought the diary entries to life and have made it easier to comprehend his path during the war. Here is my final product of Linn’s pathway between February 5-7, 1862:

Map of Linn’s Journey February 5-7, 1862

Map app link:



Importance of Tagging in TEI

Close reading is a great way to categorize people, places, events, and more within a specific text. Using TEI, we analyzed Linn’s diary by choosing which words to tag. For example, one of our class discussions centered on whether “cossack” should be tagged as a place or an object. I argued that a cossack, which is a type of boat, is always an object but, depending on the context of the sentence, can be a place, too. Cossack appears frequently in Linn’s diary, so we knew that we needed to tag it. We decided to tag it as an object because it is not a place in every instance in the diary.

We resolved the place vs. object dilemma by categorizing cossack as an object and also specifying what kind of object it is: we gave it an object type of “boat”. By consulting with my peers, I realized that there can be many different perspectives on a word, a phrase or even an entire document. Cossack is a great example of a word that can be interpreted differently depending on its context. I may feel strongly that cossack is an object, but others can interpret it differently. Collaborating on the rest of Linn’s diary will allow our class to agree on how to classify words, which will also help reconcile different opinions and interpretations.
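To make the idea of an object tag with a type concrete, here is a rough sketch written in Python rather than in Oxygen. It is an assumption, not necessarily the markup our class used: it relies on TEI's generic `rs` (referencing string) element with `type` and `subtype` attributes, the helper name `tag_object` is hypothetical, and the sample sentence is invented rather than quoted from the diary.

```python
import xml.etree.ElementTree as ET

def tag_object(word, object_type):
    """Wrap a word in an <rs> element marked as an object of a given kind.

    Assumption: <rs type="object" subtype="..."> is one generic TEI way to
    record this; a project schema may use a different element or attributes.
    """
    element = ET.Element("rs", {"type": "object", "subtype": object_type})
    element.text = word
    return element

# Invented sample sentence (not a quotation from Linn's diary).
paragraph = ET.Element("p")
paragraph.text = "We went aboard the "
boat = tag_object("Cossack", "boat")
boat.tail = " this morning."
paragraph.append(boat)

print(ET.tostring(paragraph, encoding="unicode"))
```

Recording the category and the kind of object as two separate attributes keeps the decision "it is an object" apart from the refinement "it is a boat", so either can be revised later without retagging the word itself.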

In general, marking up the transcription has helped me better understand Linn’s context and circumstances. For instance, we each started separating the people in the database into the Union and Confederate armies. Most of the people are Union, which is to be expected because Linn is part of the Union army and writes about the military men surrounding him. I also learned a little more about the men in the diary entry I transcribed. I thought that Alcot, Ripley and Prawe were all part of the Union army, but they were actually reporters, who were supposedly neutral during the war. As our database records, they were reporters for the Herald & Inquirer. Before we started categorizing people, I assumed they were part of the military and was confused about why a newspaper was mentioned. Knowing that they were not directly involved in the war clarified the diary entry. Now the context of this diary entry makes more sense!

In her essay “A Rationale of Digital Documentary Editions”, Pierazzo discusses how editors select what to tag. One of the most challenging aspects of tagging in TEI is knowing when to stop. You could essentially tag everything, but that is very time-consuming and does not distinguish significant phrases or words from less important ones. Pierazzo writes, “…we might conclude that one possible and tempting answer to the question ‘where to stop’ could be ‘nowhere’, as there are potentially infinite sets of facts to be recorded” (466). This opens the door to wide variation in interpretation: if there is no limit, one might think there is essentially no shared structure or set of guidelines across different editions. Although there may not be a hard limit, “the vast majority of decisions we make in this realm are decisions on which all (or most) competent readers agree or seem likely to agree (p. 196)” (466). Pierazzo’s point is that most tagging decisions are (almost) universally acceptable and understood. There is room for interpretation, but the tags are not random, so tagging has some order to it. Additionally, Pierazzo feels that it is important to consider your audience when tagging. She writes, “to achieve the purpose of the edition and meet the editors’ needs, one needs to ask which features bear a cognitive value, that is, which are relevant from a scholarly point of view” (469). The person marking up the document must therefore consider the audience and make thoughtful, educated decisions when tagging. Although there is no limit or “correct” way to tag words, Pierazzo believes that tagging can be orderly and structured while still leaving room for different interpretations.

Utilizing Voyant’s Distant Reading Tools

Voyant is a great resource for finding trends in specific documents. In particular, I will be using “Collocate Clusters” to make connections between words and ideas in a compiled series of diary entries by James Merrill Linn. Linn writes in his diary, “War is horrible. I first saw the pomp & circumstance – the battle field – the dead and wounded now the prison ship.” The hypothesis asks, “For Linn, is this a turning point where he loses his innocence?” Using Voyant to see relationships between words, I will test whether I can draw any conclusions about this hypothesis.

Relationship between “boat” and “men” in Linn’s diary entries

At first, I tried using the word cloud to look at trends in the diary. The two words that stood out to me were “boat” and “men”. Boat did not appear to be as prominent as other words, as it was used only 81 times in the diary entries. However, having transcribed a diary page about Linn’s time on a boat, I knew that many words in the diary related to boat, including men, captain, and regiment. Instead of using the word cloud, I decided to look at the relationship between boat and other common words. I added the compiled Linn diary entries and edited my settings to apply stop words. Then, I typed in boat to see the first few connections. As a result, I found that men was not only one of the most common words in the entire document but was also connected to boat in the diary.

Relationship between “boat”, “men”, and “wounded” in Linn’s diary entries

Next, I wanted to look at connections with one of the words in the given quote by Linn. I chose “wounded”, mostly because I remember transcribing it in my own diary entry. I typed “wounded” into the search bar at the top, hoping to find connections with boat and men. Wounded was less common in the diary than men: it was used only 31 times, whereas men was used 133 times. Although wounded was not used as often, it was connected to men. Wounded was therefore only indirectly connected to boat, through the stronger connection between men and boat.
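The connections described above rest on collocation, that is, on which words appear near a chosen keyword. The sketch below shows one simple way to count collocates within a fixed window of words. It is an illustration under stated assumptions, not Voyant's actual algorithm: the function name `collocates`, the window size, the stop-word set, and the sample text are all invented for the example.

```python
import re
from collections import Counter

def collocates(text, keyword, window=3, stop_words=frozenset()):
    """Count words that occur within `window` tokens of each keyword hit.

    Note: the window ignores sentence boundaries, which is a simplification.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != keyword:
            continue
        start = max(0, i - window)
        neighbors = tokens[start:i] + tokens[i + 1:i + 1 + window]
        counts.update(t for t in neighbors
                      if t not in stop_words and t != keyword)
    return counts

# Invented sample text (not Linn's words).
sample = "The men rowed the boat ashore. Wounded men were carried to the boat."
stop = {"the", "to", "were"}

print(collocates(sample, "boat", window=3, stop_words=stop).most_common(3))
```

In this toy text, "men" falls inside the window around "boat" more often than "wounded" does, which mirrors the stronger and weaker connections seen in the diary.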

This is useful information for distant reading because the connecting words and their sizes show how often Linn used them and the major and minor connections between them. Unfortunately, this tool does not help me come to a conclusion about Linn’s loss of innocence because it does not reveal any trends over time. The hypothesis asks whether Linn lost his innocence partway through the diary, but the connections carry no time frame, so I cannot easily find where and when in the document these words were used. Word Cloud may be more useful for finding trends, but Links is better for making connections and seeing how words relate within a document. Using both tools together could be extremely beneficial, both for making connections between words or ideas and for showing where the words occur in the document and how often they are used.

Because I could not draw any conclusions about the hypothesis, I am posing a question about distant reading in general. When doing distant reading, is it better to begin by making connections between words or by finding trends or patterns in the words? I believe that these distant reading tools go hand in hand; however, depending on what you are searching for, one can be more helpful than the other. In our case, when analyzing Linn’s diary, Word Cloud and Links could be used together to find the best result.
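One way to recover the missing time frame would be to count a word separately in each stretch of the text. The sketch below splits a text into roughly equal segments and counts a keyword in each one. It is a minimal illustration only: the function name `counts_by_segment` and the sample string are hypothetical, and the sample is an invented word list rather than a passage from the diary.

```python
import re

def counts_by_segment(text, keyword, segments=4):
    """Count keyword occurrences in consecutive, roughly equal chunks.

    Returns at most `segments` counts, in document order.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    size = max(1, -(-len(tokens) // segments))  # ceiling division
    return [tokens[i:i + size].count(keyword)
            for i in range(0, len(tokens), size)]

# Invented word list standing in for an early and a late part of a text.
sample = "pomp pomp parade battle wounded wounded prison ship"

print(counts_by_segment(sample, "wounded", segments=2))
print(counts_by_segment(sample, "pomp", segments=2))
```

If the diary entries are kept in date order, a rising count for a word like "wounded" in later segments would be the kind of trend the hypothesis asks about, though a count alone cannot show what Linn felt.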