Blog Post 4: Information Visualization and Distant Reading

    With so much information readily available to us, it would take an incredibly long time to engage with it all. A quicker way to consume data is through visualization: the expression of quantitative data in graphic form. Visualizing data can help us identify patterns in large amounts of information - in fact, according to The Digital Humanities Coursebook, anything that can be given a numerical value can be turned into some sort of visualization. While this sounds like a positive idea, there are ways to skew a visualization so that its information is purposefully misrepresented. The scale that the creators of a visualization choose directly affects how the data is perceived. For example, if a graph is drawn with excessive empty space above the data, the differences within the data look smaller and the data may appear less impactful. This leads me to a quote from the reading that stuck out to me: "...visualization should be treated with skepticism, rather than simply accepted".
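    To make the point about scale concrete, here is a minimal sketch in Python. The numbers and the function name visible_share are my own made-up assumptions, not something from the reading. It measures how much of the vertical axis the data's own range takes up under two different axis choices.

```python
def visible_share(values, axis_min, axis_max):
    """Return the fraction of the vertical axis spanned by the data's own range."""
    data_range = max(values) - min(values)
    axis_range = axis_max - axis_min
    return data_range / axis_range


# Hypothetical measurements, invented for illustration only.
values = [40, 45, 50]

# Axis fitted closely to the data: the change fills the whole plot.
tight = visible_share(values, axis_min=40, axis_max=50)

# Axis with a lot of empty space above the data: the same change looks tiny.
padded = visible_share(values, axis_min=0, axis_max=200)

print(f"tight axis: {tight:.0%} of the plot height")
print(f"padded axis: {padded:.0%} of the plot height")
```

    The data is identical in both cases; only the axis limits change, which is why the scale deserves a skeptical look.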

    Visualizations take many forms, including networks, which are systems of entities that are explicitly connected. This is demonstrated well in the Six Degrees of Francis Bacon project. While in my opinion this visualization is pretty overwhelming, it clearly illustrates a network of related nodes, which are the specific entities in a network. The links are very well defined, and the key helps with interpretation.
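    As a rough sketch of what sits underneath a network visualization, the Python below stores nodes and links and counts how many links each node has. The node names "A" through "D" are placeholders I made up; they are not taken from the Bacon project, and this is not how that site is actually built.

```python
from collections import defaultdict

# Hypothetical links between placeholder nodes (not real project data).
links = [("A", "B"), ("A", "C"), ("C", "D")]


def build_network(links):
    """Map each node to the set of nodes it is directly linked to."""
    neighbors = defaultdict(set)
    for left, right in links:
        # Links are treated as undirected, so record both directions.
        neighbors[left].add(right)
        neighbors[right].add(left)
    return neighbors


network = build_network(links)

# A node's degree is the number of links attached to it.
degree = {node: len(adjacent) for node, adjacent in network.items()}

for node in sorted(degree):
    print(node, "is linked to", sorted(network[node]))
```

    A tool for drawing networks would take a structure like this and place the nodes on screen, often drawing the better-connected ones larger.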

    Returning to the idea that there is a seemingly infinite amount of information out there, we can agree that there is no way it could all be read by humans any time soon. This is where data mining can help us. It is an automated form of analysis that extracts meaningful information from large data files. A subset of data mining is text analysis, which focuses on analyzing language. This is the process that was used in the project Yesterday, Today, and Tomorrow. The specific form of data mining that we see here is referred to as "distant reading", which involves processing large amounts of textual data without actually reading the text. While there are pros to this process, it skips over a human's perception of the words. An example used in the text was the word "sure": does it mean yes, or does it refer to how confident someone is, as in the sentence "I am sure of it!"? On the positive side, this process can help identify large cultural trends in data, such as the emotions portrayed in the Tweets used in the aforementioned project. But it misses nuances in language that a human may be able to identify. My experience with this site was limited, as my computer would not let me visit the actual project. But from what I was able to view, I thought the larger bubbles of color indicating more frequent emotions were a very powerful piece of this project.
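    The "sure" problem can be shown with a small sketch of the simplest kind of distant reading, counting words. The three sentences are my own invented examples, and this is only an assumption about how a basic word count works, not the method used by the Yesterday, Today, and Tomorrow project.

```python
import re
from collections import Counter

# Hypothetical short texts, invented for illustration only.
texts = [
    "Sure, I will come along.",  # "sure" meaning yes
    "I am sure of it!",          # "sure" meaning confident
    "Are you sure?",             # "sure" as a question about confidence
]


def word_counts(texts):
    """Count lowercase words across all texts, ignoring punctuation."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


counts = word_counts(texts)

# All three uses land in the same bucket, whatever they meant.
print("sure:", counts["sure"])
```

    The count treats every "sure" as the same word, so the difference between agreeing and being confident disappears, which is exactly the nuance a human reader would catch.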

Comments

  1. Great quote: "...visualization should be treated with skepticism, rather than simply accepted". The emotional categories of YTT are intriguing, but don't necessarily work too well. We'll look at it tomorrow!
