Hi all,
Our next meeting will be *Wednesday, February 26*, when Shusei Eshima and
Tomoya Sasaki will present research on *“Keyword Assisted Topic Models”*.
*Abstract:* For a long time, many social scientists have conducted a
content analysis by simply counting carefully selected key words and
phrases contained in documents of interest. In recent years, however,
probabilistic topic models have become increasingly popular because of
their ability to uncover topics and keywords based on the co-occurrence of
certain words. Unfortunately, applied researchers find that these models
often fail to yield topics of their interest by inadvertently creating
nonsensical topics, merging unrelated topics, or splitting a single
coherent topic. In this paper, we empirically demonstrate that providing
topic models with a small number of keywords can dramatically improve their
performance. The proposed keyword assisted topic model (keyATM) offers an
important advantage that the specification of keywords requires researchers
to label topics prior to fitting a model to the data. This contrasts with
a widespread practice of post-hoc topic interpretation and adjustments that
compromises the objectivity of empirical findings. In our applications, we
find that the keyATM provides more interpretable results, has better
document classification performance, and is more robust to the number of
topics than the standard topic models. Finally, keyATM can also model
covariate effects and time trends. An open-source software package is
freely available for implementing the proposed methodology.
*Where:* CGIS Knafel Building, Room K354 (see this link
<https://map.harvard.edu/?bld=04471&level=9> for directions).
*When:* Wednesday, February 26, 12 noon - 1:30pm.
All are welcome and lunch will be provided.
Best,
Georgie