Hi all, 

Our next meeting will be on Wednesday, February 26, when Shusei Eshima and Tomoya Sasaki will present their research on “Keyword Assisted Topic Models”.

Abstract: For a long time, many social scientists have conducted a content analysis by simply counting carefully selected key words and phrases contained in documents of interest. In recent years, however, probabilistic topic models have become increasingly popular because of their ability to uncover topics and keywords based on the co-occurrence of certain words. Unfortunately, applied researchers find that these models often fail to yield topics of their interest by inadvertently creating nonsensical topics, merging unrelated topics, or splitting a single coherent topic. In this paper, we empirically demonstrate that providing topic models with a small number of keywords can dramatically improve their performance. The proposed keyword assisted topic model (keyATM) offers an important advantage that the specification of keywords requires researchers to label topics prior to fitting a model to the data. This contrasts with a widespread practice of post-hoc topic interpretation and adjustments that compromises the objectivity of empirical findings. In our applications, we find that the keyATM provides more interpretable results, has better document classification performance, and is more robust to the number of topics than the standard topic models. Finally, keyATM can also model covariate effects and time trends. An open-source software package is freely available for implementing the proposed methodology.

Where: CGIS Knafel Building, Room K354.

When: Wednesday, February 26, from 12:00 noon to 1:30 pm.

All are welcome and lunch will be provided. 

Best, 
Georgie