What I Learned From Tagging

Learning how to mark up our documents and then applying what we learned to our journal entries has given me a deeper understanding of the way that Linn writes about the war. Although the process was tricky and frustrating at points, purely because of my lack of experience, I believe that it brought focus to the specific kinds of things that Linn talks about when he is writing. For example, when going through the version of the Google Doc that was marked up with colors, it was clear that some colors were used more than others. I would say that blue and orange were the two most used, while purple, brown, and cyan were the least used. This gives us insight into Linn’s writing style, which focuses on people and objects. Although he is descriptive in some places, he sometimes jumps from topic to topic, which is why we see less cyan, brown, and purple.

A lot of what Pierazzo talks about in her piece was visible in our process. For example, there was a large variety in the amount of tagging that occurred, with some people tagging most words while others picked out only the important ones. This resonates with Pierazzo’s article when she says, “So, we must have limits, and limits represent the boundaries within which the hermeneutic process can develop” (466). One of my paragraphs is below (A), and I chose to tag only the words that I thought were important and relevant.

[Image: Screen Shot 2014-10-26 at 9.21.11 PM]


[Image: Screen Shot 2014-10-26 at 9.20.56 PM]



Although I do not think I made a mistake in being sparse, other people heavily marked up their entries (B), which made me think about why they considered those words important compared to how I chose mine. Again, this links back to Pierazzo when she says that a digital edition includes words and sections that are “considered meaningful to the editors” (475) and “that one cannot declare once and for all which features should be included” (475). The degree to which each person marked up their piece was one of the most interesting factors when I looked over everyone else’s entries.

[Image: Screen Shot 2014-10-26 at 10.09.45 PM]

I also learned a lot in the editorial process, primarily that it is harder than I thought to come to conclusions on basic questions, like whether a boat is a place or an object. When we were talking about the cossack and the different ways to go deeper in tagging, it changed the way that I thought about this tagging and my reading. When tagging my own entry, I had a deep internal struggle about how to tag battery, considering that, like cossack, it could be both. My struggle was then externalized when we came to class and discussed cossack. When talking about what to mark up and what not to mark up, Pierazzo says that it “depends either on the particular vision that we have of a particular manuscript or on practical constraints” (465). For me, the idea of a particular vision is why we disagreed. I saw battery as a place, and when asked about cossack it made sense to me that it would be a place too. Boats represent places for me, but someone made the point that to Linn they are objects, not places, and that makes sense to me. By making it an object but adding the type boat, we were able to come to a consensus. However, considering that we spent so much time arguing over one word, I dread to think what it must be like to go through an entire edited text. Overall, I thought this process was interesting; it gave me a better look into the kinds of words and descriptions that Linn uses and taught me some useful new skills.
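
To make the compromise concrete, here is a minimal sketch of how the two readings of cossack might look in TEI. The element and attribute names (`rs` with `type` and `subtype`) are an assumption chosen for illustration; the markup the class actually settled on may have used different names.

```xml
<!-- Reading 1: the boat as a place that Linn travels to -->
<placeName>cossack</placeName>

<!-- Reading 2 (the consensus): an object, further typed as a boat.
     rs, type and subtype are assumed here, not taken from the class files. -->
<rs type="object" subtype="boat">cossack</rs>
```

The second form keeps the broad category everyone could agree on while still recording the detail that started the argument.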

Things I learned through tagging

The process of marking up my transcription was definitely very helpful, as it allowed me to make observations that I would not otherwise have made. The first step was for us to tag people, places, objects, events, etc. in our own diary entry. Before doing the markups in XML, we made a class Google document with all of our diary entries in order. Each category (people, places, objects, etc.) had its own color, and we were instructed to highlight the words accordingly. For me, this was the most useful step. This was when I decided which words were important enough to be highlighted. For example, a person was referred to in Linn’s entry as “gentleman,” but I decided that he was someone Linn saw in passing and did not need to be marked up.

Another helpful part of this step was that once my classmates and I had finished the markups, I was able to scroll through the document and see which color was the most prominent. It turned out that blue and orange, which represented people and objects, appeared to be the two most common colors. On the other hand, red, which represented events, was probably the least common. This allowed me to observe that Linn did not view the specific events, accomplishments, or defeats of the battle as significant to write about; instead, he focused on the people and objects that directly involved him on a day-to-day basis.

Lastly, by scrolling through the document I was able to see that each person chose to focus on tagging different word types. For example, some diary entries had numerous purple markups (dates and times) and others had none. I do not think that this difference came about because of Linn; rather, it occurred because of the students’ different ideas of what was important. This observation connects heavily to the Pierazzo reading. Pierazzo focused a lot on how the digital medium allows for greater possibilities for representation, which proved to be true. Additionally, I was able to see the large role that individuality and perspective play in marking up documents, as Pierazzo discussed. By actually completing markups and comparing mine to those of my classmates, I now agree with Pierazzo’s statement that “a digital edition includes features of the original document that are considered meaningful to the editors” (475). The digital edition is exactly so, but it may be difficult to understand this without actually going through the process yourself.

After highlighting in the Google document, we used XML to tag the words. Personally, I think it is significantly harder to make observations in this medium. This is because the Google document allowed for both close and distant reading, which the XML does not. In XML, only close reading comes easily. I definitely used this method: for each word that I tagged, I first analyzed its importance in terms of Linn and his entry. Based on my analysis, I decided whether the word was worth tagging. This connects to another central topic of Pierazzo’s article, which was “when to stop.” Since the digital world does not place many limitations on editors, how do they know when enough is enough? Personally, I believe it is better to under-tag than over-tag, because if every other word is tagged it is harder to see what is truly meaningful.

Another aspect of this project that was an eye-opener for me was the class debate. During this class, I felt like I was at an editorial staff meeting. We sat in a circle comparing specific words that some of us had tagged as different word types. For example, cossack was a word of huge debate. A portion of the class felt that cossack was a place, but others argued that it was an object. It was interesting to take part in this debate and, in the end, to agree on one of the two. As a class we decided to mark cossack as an object. We came to this conclusion because although cossack is sometimes mentioned as a place to which Linn is going, this is not always the case. However, it cannot be disputed that cossack is always an object, since it is a boat. I thought it was very interesting to see how much passion was put into this argument over tagging one single word.

I also found that this act of collaboration was helpful in enhancing my TEI file. Prior to this class, I did not go into detail on any of my tags. I merely used the word categories given to me, without identifying anything further. As a class we agreed that Beaver was someone of importance based on how frequently he was discussed throughout the diary entries. Since he was important, we decided to give him an attribute. As a group we thought it was appropriate to give Beaver the type military.
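
As a rough illustration of what adding an attribute means, the sketch below shows a bare category tag next to the same tag with a type. `persName` is the standard TEI element for personal names, but whether the class files used it, and used `type` in exactly this way, is an assumption.

```xml
<!-- Before: only the word category is recorded -->
<persName>Beaver</persName>

<!-- After: the same tag with the type the class agreed on.
     The element name is assumed; only the value "military" comes from the discussion. -->
<persName type="military">Beaver</persName>
```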

I definitely had a lot of fun doing this project, and I learned a lot about digital editions and the battles that editors can face in the process of publishing. Sometimes freedom is a bad thing because it can be difficult to place limits on oneself. Although a digital edition will never be the same as its source document, I enjoyed trying to preserve it as much as I could. For example, in the TEI the line breaks match those of the original copy. I also kept Linn’s abbreviations, such as his ampersands. Although there are some aspects that cannot be replicated, such as the specific spacing between his written words, it is important to maintain as much as the digital medium allows.
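
For readers who have not seen TEI, a small sketch of these two choices follows. The words inside the paragraph are placeholders, not Linn’s text. `lb` is the TEI element that marks the start of a new line in the source, and an ampersand has to be written as `&amp;` in any XML file.

```xml
<p>
  <!-- Placeholder wording; only the lb and &amp; conventions are the point here -->
  <lb/>first line as written in the manuscript &amp; so on
  <lb/>second line as written in the manuscript
</p>
```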