Basic Topic Modeling vs Text Analysis

Text analysis and topic modeling are distinct practices in the Digital Humanities, but we were introduced to both in a single session of our class this quarter. Both are forms of distant reading, applied either to a single text or to a corpus of texts.

Screenshot: having difficulty loading a small .txt file

Voyant is an example of a simple distant reading tool: it lets us count the terms in a text, find their collocates (the terms that appear near them), and visualize those frequencies in a variety of ways. Mallet is a tool for more detailed topic modeling, which differs from basic text analysis in that it identifies topics, groups of terms that tend to occur together, rather than just individual terms.
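To make the idea of counting terms and their collocates concrete, here is a minimal sketch in Python. It is not how Voyant itself is implemented; the function names, the two-word window, the tiny stopword list, and the sample sentence are all assumptions made up for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def term_frequencies(tokens, stopwords=frozenset()):
    """Count how often each non-stopword term appears."""
    return Counter(t for t in tokens if t not in stopwords)


def collocates(tokens, keyword, window=2, stopwords=frozenset()):
    """Count terms that appear within `window` tokens of `keyword`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != keyword:
            continue
        start = max(0, i - window)
        # Tokens just before and just after this occurrence of the keyword.
        neighbors = tokens[start:i] + tokens[i + 1:i + 1 + window]
        counts.update(n for n in neighbors if n not in stopwords)
    return counts


# Hypothetical sample text, not taken from the curriculum guide.
sample = (
    "Teachers teach history. Students learn history from teachers. "
    "History standards guide teachers."
)
STOPWORDS = frozenset({"from", "the", "a", "of"})

tokens = tokenize(sample)
freqs = term_frequencies(tokens, STOPWORDS)
near_history = collocates(tokens, "history", window=2, stopwords=STOPWORDS)

print(freqs.most_common(3))
print(near_history.most_common(3))
```

To try this on a real document, the sample string could be swapped for the contents of a .txt file. A larger window or a fuller stopword list would change which collocates surface, which is part of why tools like Voyant expose those settings.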


Voyant 2.0 Beta

Both tools interested me with regard to my own research, especially curricular standards used in content specialties in the K-12 context. Standards are generally the “what” of the classroom: they narrow down the topics teachers focus on in any given year, in any given course or class period. For this tutorial, I created a .txt file of the new 2015 AP US History curriculum guide, which has met with a wide variety of pushback despite the many years it took to create. I’m also part of a group doing both quantitative and qualitative analysis of these new standards from a critical perspective, based on previous work by our lead author, Dr. Sarah B. Shear at Penn State.

While we did not decide to use the Mallet GUI or Voyant in our analysis, I was impressed by the interesting data these tools can gather, much of it on the same level as the quantitative coding that each member of our team had to painstakingly complete before running descriptive statistics on the results. Unfortunately, without a fully functional Voyant 2, we were not able to get the contextual information we needed for our analysis.



Mallet GUI

Topic modeling with the Mallet GUI that Dr. Zoe Borovsky provided in our class tutorial resulted in a far more interesting analysis of this document. For example, it tied teachers, rather than students, to the work of history, which is supposedly what the standards attempted to ameliorate. While I would need to do further analysis, possibly with the full Mallet package, this did give me an interesting topic to explore in qualitative analysis that I might not have otherwise noticed.