My colleague Jelle Boumans and I just published an article outlining different approaches to automated content analysis. The focus is on applying computational social science approaches, often rooted in computer science, to the analysis of digital texts, especially journalistic texts. The article is part of a special issue with a lot of interesting related work (editorial).
The abstract of our article:
When analyzing digital journalism content, journalism scholars are confronted with a number of substantial differences compared to traditional journalistic content. The sheer amount of data and the unique features of digital content call for the application of valuable new techniques. Various other scholarly fields are already applying computational methods to study digital journalism data. Often, their research interests are closely related to those of journalism scholars. Despite the advantages that computational methods have over traditional content analysis methods, they are not commonplace in digital journalism studies. To increase awareness of what computational methods have to offer, we take stock of the toolkit and show the ways in which computational methods can aid journalism studies. Distinguishing between dictionary-based approaches, supervised machine learning, and unsupervised machine learning, we present a systematic inventory of recent applications both inside and outside journalism studies. We conclude with suggestions for how the application of new techniques can be encouraged.
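To give a flavor of the simplest family of techniques mentioned in the abstract, here is a minimal sketch of a dictionary-based approach: each text is scored by the share of its tokens that appear in a hand-crafted term dictionary. The dictionary and the example sentences below are invented for illustration and are not taken from the article.

```python
import re

# Hypothetical dictionary of economy-related terms (illustrative only)
ECONOMY_TERMS = {"economy", "market", "inflation", "unemployment"}

def dictionary_score(text, terms=ECONOMY_TERMS):
    """Return the share of tokens in `text` that match the dictionary."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in terms)
    return hits / len(tokens)

articles = [
    "Inflation worries hit the stock market again.",
    "The mayor opened a new city park on Sunday.",
]
scores = [dictionary_score(article) for article in articles]
```

Supervised approaches replace the hand-crafted dictionary with a classifier trained on manually coded examples, while unsupervised approaches (such as topic models) induce categories from the texts themselves.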
Boumans, J.W., & Trilling, D. (2015). Taking Stock of the Toolkit: An overview of relevant automated content analysis approaches and techniques for digital journalism scholars. Digital Journalism, online first. doi:10.1080/21670811.2015.1096598