
Using Voyant’s Distant Reading Tools

Voyant is a great resource for finding trends in specific documents. In particular, I will be using “Collocate Clusters” to make connections between words and ideas in a compiled series of diary entries by James Merrill Linn. In his diary, Linn writes, “War is horrible. I first saw the pomp & circumstance – the battle field – the dead and wounded now the prison ship.” The hypothesis I am testing takes the form of a question: “For Linn, is this a turning point where he loses his innocence?” Using Voyant to see relationships between words, I will see whether I can draw any conclusions about this hypothesis.

Screenshot: Relationship between “boat” and “men” in Linn’s diary entries

At first, I tried using the word cloud to look at trends in the diary. The two words that stood out to me were “boat” and “men”. “Boat” was not as prominent as other words, since it appears only 81 times in the diary entries. However, having transcribed a diary page about Linn’s experience boating, I knew that many words in the diary relate to “boat”, including “men”, “captain”, and “regiment”. Instead of relying on the word cloud, I decided to look at the relationship between “boat” and other common words. I loaded the compiled Linn diary entries, turned on stop words in the settings, and typed in “boat” to see its first few connections. The result showed that “men” was not only one of the most common words in the entire document but was also linked to “boat”.
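
For readers curious about what a collocate search involves, here is a minimal Python sketch of the two steps described above: counting words after removing stop words, then counting which words fall near a keyword. It is not Voyant's actual code. The stop-word list, the window size, the function names, and the sample sentence are all assumptions made for illustration, and the sample is not taken from Linn's diary.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real tool's list is much longer.
STOP_WORDS = {"the", "a", "and", "to", "of", "in", "on", "was", "were", "with"}

def tokenize(text):
    """Lowercase the text and return its words, skipping stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOP_WORDS]

def collocates(tokens, keyword, window=3):
    """Count the words that appear within `window` tokens of `keyword`.

    The window is measured after stop words have been removed.
    """
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == keyword:
            start = max(0, i - window)
            neighbors = tokens[start:i] + tokens[i + 1:i + 1 + window]
            counts.update(neighbors)
    return counts

# Made-up placeholder text, NOT a passage from Linn's diary.
sample = "The men rowed the boat to shore. The captain ordered the men into the boat."

tokens = tokenize(sample)
word_counts = Counter(tokens)           # overall frequencies, like a word cloud
boat_links = collocates(tokens, "boat") # words that occur near "boat"

print(word_counts.most_common(3))
print(boat_links.most_common(3))
```

Running the same functions on the full diary text would give the kind of counts a word cloud and a collocate view are built from, although the exact numbers depend on the stop-word list and window size chosen.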

Screenshot: Relationship between “boat”, “men”, and “wounded” in Linn’s diary entries

Next, I wanted to look at connections with one of the words in the quote from Linn above. I chose “wounded”, mostly because I remember transcribing it in my own diary entry. I typed “wounded” into the search bar at the top, hoping to find connections with “boat” and “men”. “Wounded” is much less common in the diary than “men”: it appears only 31 times, whereas “men” appears 133 times. Even so, “wounded” was connected to “men”. Because “men” and “boat” are strongly connected, “wounded” is indirectly connected to “boat” through “men”.
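
The idea of an indirect connection can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how Voyant computes its graph: the helper names `build_links` and `bridges`, the window of two words, and the token list are all hypothetical, and the tokens are not from Linn's diary.

```python
from collections import defaultdict

def build_links(tokens, window=2):
    """Map each word to the set of words found within `window` tokens of it."""
    links = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[max(0, i - window):i + window + 1]:
            if other != word:
                links[word].add(other)
    return links

def bridges(links, a, b):
    """Return the words that are linked to both `a` and `b`."""
    return links[a] & links[b]

# Made-up token list, NOT taken from Linn's diary.
tokens = ["wounded", "men", "carried", "from", "field", "at", "dawn",
          "while", "other", "men", "rowed", "boat"]

links = build_links(tokens, window=2)
directly_linked = "boat" in links["wounded"]      # no direct link in this sample
shared_words = bridges(links, "wounded", "boat")  # words linking the two

print(directly_linked)
print(shared_words)
```

In this toy example “wounded” and “boat” never appear close together, but both appear near “men”, which is the same kind of indirect link I saw in the diary.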

This is useful information for distant reading because the connecting words and their sizes show how often Linn used them, along with the major and minor connections between them. Unfortunately, this tool does not help me reach a conclusion about Linn’s loss of innocence, because it does not reveal any trends. The hypothesis asks whether Linn lost his innocence halfway through the transcription, but the connections have no time frame, so I cannot easily find where and when in the document these words were used. Word Cloud may be more useful for finding trends, while Links (the collocate view I used here) is better for making connections and seeing how words relate within a document. Using both tools together could be extremely beneficial: one makes the common connections between words or ideas visible, and the other helps show where the words appear in the document and how often they are used.

Because I could not draw any conclusions about the hypothesis, I am posing a question about distant reading in general. When doing distant reading, is it better to begin by making connections between words or by finding specific trends or patterns in how the words are used? I believe that these distant reading tools go hand in hand; however, depending on what you are searching for, one can be more helpful than the other. In our case, when analyzing Linn’s diary, Word Cloud and Links could be used together to find the best result.
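
One possible way to get at the missing time frame is to split the text into equal segments and count a word in each one. The Python sketch below shows the idea. The function name `segment_counts`, the choice of four segments, and the token list are my own assumptions for illustration; the tokens are invented and do not come from Linn's diary.

```python
def segment_counts(tokens, keyword, segments=4):
    """Count how often `keyword` occurs in each of `segments` equal slices."""
    counts = [0] * segments
    for i, token in enumerate(tokens):
        if token == keyword:
            # Map token position i to a segment index from 0 to segments - 1.
            counts[i * segments // len(tokens)] += 1
    return counts

# Made-up token list, NOT taken from Linn's diary.
tokens = ["parade", "boat", "camp", "march", "boat", "wounded", "prison", "wounded"]

wounded_by_segment = segment_counts(tokens, "wounded")
boat_by_segment = segment_counts(tokens, "boat")

print(wounded_by_segment)
print(boat_by_segment)
```

If a word like “wounded” showed up mostly in the later segments of the real diary, that would be a starting point for asking whether Linn’s tone changes partway through, though it would not settle the question on its own.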