Blog Post 3: Metadata and Databases

    Metadata is the set of terms applied to a data set to describe what the data is about. For example, if you had data about books published in the 19th century, your data might include things like book title, author, and year published. These categories of data are metadata. Metadata is important because it allows data to be identified and interpreted. As stated in The Digital Humanities Coursebook, "Without metadata, information in files would be like books without covers or title pages on shelves without labels" (Drucker 53). When creating metadata schemes, standardization is extremely important. While there are many words for one type of thing, using the same term for the same type of data every time allows the data to be found later on with more ease. For example, if in one instance you label something as "year published" and in another instance you label the same type of data as "date published," a later search for "year published" will miss the data labeled "date published." Standardizing language also helps data created by one group to be used by others. As a result, many fields have their own standardized metadata schemes used to describe data.
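
    To make the "year published" versus "date published" problem concrete, here is a small Python sketch. The two book records, the function names, and the alias table are my own illustrative assumptions, not something taken from the Coursebook. It shows a search by field name missing a record until the labels are standardized.

```python
# Hypothetical records: the second one uses a different label for the same kind of data.
records = [
    {"title": "Jane Eyre", "author": "Charlotte Brontë", "year published": 1847},
    {"title": "Moby-Dick", "author": "Herman Melville", "date published": 1851},
]

def find_with_field(records, field):
    """Return the titles of all records that carry the given metadata field."""
    return [r["title"] for r in records if field in r]

# Assumed mapping from variant labels to the one standard term.
FIELD_ALIASES = {"date published": "year published"}

def standardize(record):
    """Rename any variant field labels to the standard term."""
    return {FIELD_ALIASES.get(key, key): value for key, value in record.items()}

before = find_with_field(records, "year published")
after = find_with_field([standardize(r) for r in records], "year published")

print(before)  # ['Jane Eyre']
print(after)   # ['Jane Eyre', 'Moby-Dick']
```

    Before standardizing, the search only finds one of the two books. After the labels are mapped onto a single term, both show up.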

    Databases are collections of structured data. Databases can be flat (single unconnected tables) or relational, "...a collection of tables connected to each other" (Drucker 71). Relational databases are created by linking information. In a flat database there would be a single table with a column for each type of information: for example, author, book title, year published, and shelf location. A relational database would instead have a table of authors, each of whom would be assigned a unique ID. This unique author ID would then be referenced in a second table of books. Each book would have an author ID associated with it and also its own unique book ID. This way you could have a third table for shelf location within a library. Each location would have its own ID, or primary key, and would also be associated with a book ID, which acts as a foreign key. This allows one piece of information to be connected with multiple others, making data interpretation easier.
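
    The three-table example can be sketched with Python's built-in sqlite3 module. The table names, column names, ID numbers, and shelf labels below are assumptions I made up for illustration. The sketch follows the layout described above: each table has its own primary key, and foreign keys link books to authors and shelf locations to books.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE authors (
    author_id INTEGER PRIMARY KEY,
    name TEXT
);
CREATE TABLE books (
    book_id INTEGER PRIMARY KEY,
    title TEXT,
    year_published INTEGER,
    author_id INTEGER REFERENCES authors(author_id)  -- foreign key
);
CREATE TABLE shelf_locations (
    location_id INTEGER PRIMARY KEY,
    shelf TEXT,
    book_id INTEGER REFERENCES books(book_id)        -- foreign key
);
INSERT INTO authors VALUES (1, 'Charlotte Brontë');
INSERT INTO books VALUES (10, 'Jane Eyre', 1847, 1);
INSERT INTO books VALUES (11, 'Villette', 1853, 1);
INSERT INTO shelf_locations VALUES (100, 'Shelf A3', 10);
INSERT INTO shelf_locations VALUES (101, 'Shelf B1', 11);
""")

# Follow the IDs across all three tables to rebuild one combined view.
rows = conn.execute("""
SELECT authors.name, books.title, shelf_locations.shelf
FROM books
JOIN authors ON books.author_id = authors.author_id
JOIN shelf_locations ON shelf_locations.book_id = books.book_id
ORDER BY books.book_id
""").fetchall()

for row in rows:
    print(row)
```

    The author's name is stored only once, in the authors table, but the join connects it to both books and to each book's shelf.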

    Much like all other aspects of Digital Humanities, creating metadata and databases involves human decisions about what goes in what category, which affects how the data can be used and could create biases when interpreting the data. This seems to be a common thread throughout all of the chapters thus far. When we begin doing curation projects, it will be important to carefully consider the impacts of what labels are being placed on information and how that will affect the use of that information in the future.

    With regard to the project I am currently analyzing, Virtual Angkor, which is a virtual reality experience and 3-D model, geospatial data is a key element. Geospatial data can be encoded in the markup language KML (Keyhole Markup Language). My understanding is that KML is used to tag information about where something is located on the earth's surface. This is a standardized format that can be read by a mapping program such as Google Earth. According to the ArcGIS Pro website, once a map is marked up with KML it can be read as a layer on top of an existing map.
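
    To get a feel for what KML looks like, here is a small Python sketch that reads one KML placemark using the standard library. The placemark is one I wrote myself with approximate coordinates for Angkor Wat. It is not data from the Virtual Angkor project, and I am only assuming a file from that project would be laid out similarly. KML lists coordinates as longitude, latitude, altitude.

```python
import xml.etree.ElementTree as ET

# Hand-written example placemark (approximate coordinates, for illustration only).
KML_TEXT = """
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Placemark>
    <name>Angkor Wat</name>
    <Point>
      <coordinates>103.867,13.4125,0</coordinates>
    </Point>
  </Placemark>
</kml>
"""

# KML elements live in this XML namespace, so lookups need the prefix mapping.
NS = {"kml": "http://www.opengis.net/kml/2.2"}

root = ET.fromstring(KML_TEXT)
placemarks = []
for pm in root.findall(".//kml:Placemark", NS):
    name = pm.find("kml:name", NS).text
    coords = pm.find("kml:Point/kml:coordinates", NS).text.strip()
    # KML order is longitude, latitude, altitude.
    lon, lat, alt = (float(value) for value in coords.split(","))
    placemarks.append((name, lat, lon))

print(placemarks)  # [('Angkor Wat', 13.4125, 103.867)]
```

    The tags such as name and coordinates are the standardized labels, which is what lets a mapping program know where to place the point.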

Comments

  1. I like how you emphasized the importance of metadata and how critical it is to organizing data. You did a great job of explaining the difference between data and metadata and why both play a key role in the organization of information. You provided good examples of both data and metadata, helping to describe why each relies upon one another. You also did a good job of explaining how the organization of data and metadata directly relates to the digital humanities projects that we are currently conducting research on and analyzing.

  2. Yes! The awesome book quote again! And nice explanation of flat and relational databases. And, yes, "When we begin doing curation projects it will be important to carefully consider the impacts of what labels are being placed on information and how that will affect the use of that information in the future." Also glad you're digging into Geospatial data for your project!

  3. I really appreciate how you provide examples in each term you describe. I specifically liked the example you provide that shows the difference between flat and relational databases. Also, the way you included data standardization and broke it down was very helpful in my understanding. For my project, Maine Sound and Story, data standardization is very important. They have an advanced search tool where you can filter through location, interviewer, collection, affiliation, and format. Data standardization is important as using similar terminology is more helpful for researchers trying to yield specific results. My project mostly uses descriptive data. The descriptive data has helped me in terms of finding what year the interview was conducted, the name of the interviewer and interviewee, and the location of the interview.

