Data Mining and Quantifying Literature

    As we are learning, Digital Humanities encompasses the various ways that data can be analyzed, quantified, and organized by engaging with different materials. This week, using Voyant Tools with our author texts was a way to expand on information visualization and change how online texts can be understood. Quantifying literature means using statistical and computational methods to count and measure various aspects of a text. This is also referred to as distant reading, where people study patterns on a larger scale. Voyant Tools does this with our author texts by measuring how often certain words or phrases occur in texts by different authors. It also reports lexical diversity, and measures such as sentence length and punctuation can point to stylistic changes and help with authorship attribution. Quantifying literature allows people to get a closer look at the significance of patterns across an entire piece of writing.
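
    To make the idea of counting words concrete, here is a minimal Python sketch of two of the measures mentioned above: word frequency and lexical diversity (computed here as a simple type-token ratio). This is not how Voyant Tools is implemented; the function names and the sample sentence are made up for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text):
    """Count how many times each word occurs."""
    return Counter(tokenize(text))


def type_token_ratio(text):
    """Unique words divided by total words: a rough lexical diversity score."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


# Made-up sample sentence, not taken from any of our author texts.
sample = "The sea was calm. The sea was cold, and the night was long."

freqs = word_frequencies(sample)
diversity = type_token_ratio(sample)

print(freqs.most_common(3))
print(round(diversity, 2))
```

    A higher ratio suggests a more varied vocabulary, though the ratio tends to fall as texts get longer, so it is best compared across texts of similar length.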

    "One thing he played with was the author’s use of “that” and “which.” He says, “In poring over these graphs, if I took all of his books and [using the computer] divided the ‘that’s’ by the ‘which’s’ and looked for a pattern, there was a straight line—you could tell when a novel of his was written, almost down to the year. At a very simple level, I would love to know these kinds of facts about the novelists I read all the time" (Quantifying Literature). This is a great example to consider when going forward with our analysis of our short stories and how we can use words to connect patterns to try and get a better understanding of the authors tactics when writing. 

    Data mining plays into this as well: it extracts useful information from larger data sets using computational tools like Voyant Tools. With larger amounts of text, computers can analyze an electronic library and mine it for patterns. There are many considerations that come into play with data mining, "which are the statistical analyses of frequency, proximity, and value of individual data points within the larger sets. Principles like collocation of words are judged relative to other usage-- and in contrast to the sum of all other words in a sample" (Drucker, 112). Data mining and quantifying literature both make it possible for digital humanities to engage more broadly with literature and literary patterns.
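
    As a rough illustration of collocation, the Python sketch below counts which words appear within a few words of a chosen target word. It only produces raw counts; judging them "relative to other usage," as Drucker describes, would require a further statistical step that is left out here. The sample text, the function name, and the window size are assumptions for the example.

```python
import re
from collections import Counter


def collocates(text, target, window=2):
    """Count words appearing within `window` words on either side of `target`."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, word in enumerate(words):
        if word == target:
            start = max(0, i - window)
            # Words before and after the target, excluding the target itself.
            neighbors = words[start:i] + words[i + 1:i + 1 + window]
            counts.update(neighbors)
    return counts


# Made-up sample text, not taken from any of our author texts.
sample = "The dark sea rose. The cold sea fell. A dark sky hung over the sea."

result = collocates(sample, "sea", window=2)
print(result.most_common(3))
```

    Common words like "the" will dominate raw counts like these, which is one reason collocation is usually weighed against how often each word appears in the sample overall.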
