gov3009-l February 2020

gov3009-l@lists.fas.harvard.edu

1 participant
4 discussions

Gov 3009 Applied Statistics Workshop (02/26) - Shusei Eshima & Tomoya Sasaki

by Evans, Georgina

Hi all,

Our next meeting will be *Wednesday, February 26*, where Shusei Eshima and Tomoya Sasaki will present research on *“Keyword Assisted Topic Models”*.

*Abstract:* For a long time, many social scientists have conducted a content analysis by simply counting carefully selected key words and phrases contained in documents of interest. In recent years, however, probabilistic topic models have become increasingly popular because of their ability to uncover topics and keywords based on the co-occurrence of certain words. Unfortunately, applied researchers find that these models often fail to yield topics of their interest by inadvertently creating nonsensical topics, merging unrelated topics, or splitting a single coherent topic. In this paper, we empirically demonstrate that providing topic models with a small number of keywords can dramatically improve their performance. The proposed keyword assisted topic model (keyATM) offers an important advantage that the specification of keywords requires researchers to label topics prior to fitting a model to the data. This contrasts with a widespread practice of post-hoc topic interpretation and adjustments that compromises the objectivity of empirical findings. In our applications, we find that the keyATM provides more interpretable results, has better document classification performance, and is more robust to the number of topics than the standard topic models. Finally, keyATM can also model covariate effects and time trends. An open-source software package is freely available for implementing the proposed methodology.

*Where:* CGIS Knafel Building, Room K354 (see this link <https://map.harvard.edu/?bld=04471&level=9> for directions).

*When:* Wednesday, February 26 at 12 noon - 1:30 pm.

All are welcome and lunch will be provided.

Best,
Georgie


Gov 3009 Applied Statistics Workshop (02/19) - Asya Magazinnik

by Evans, Georgina

Hi all,

Our next meeting will be *Wednesday, February 19*, where Asya Magazinnik will present research on *“What Do We Learn About Voter Preferences From Conjoint Experiments?”*

*Abstract:* Political scientists frequently interpret the results of conjoint experiments as reflective of voter preferences. In this paper we show that the target estimand of conjoint experiments, the AMCE, is not well-defined in these terms. Even with individually rational experimental subjects, unbiased estimates of the AMCE can indicate the opposite of the true preference of the majority. To show this, we characterize the preference aggregation rule implied by AMCE and demonstrate its several undesirable properties. With this result we provide a method for placing sharp bounds on the proportion of experimental subjects with a strict preference for a given candidate-feature. We provide a testable assumption to show when the AMCE corresponds in sign with the majority preference. Finally, we offer a structural interpretation of the AMCE and highlight that the problem we describe persists even when a model of voting is imposed.

The paper can be found here <https://scholar.princeton.edu/sites/default/files/kkocak/files/conjoint_dra…>.

*Where:* CGIS Knafel Building, Room K354 (see this link <https://map.harvard.edu/?bld=04471&level=9> for directions).

*When:* Wednesday, February 19 at 12 noon - 1:30 pm.

All are welcome and lunch will be provided.

Best,
Georgie


Gov 3009 Applied Statistics Workshop (02/12) - Adam Kapelner

by Evans, Georgina

Hi all,

Our next meeting will be *Wednesday, February 12*, where Adam Kapelner will present research on *“Harmonizing Optimized Designs with Classic Randomization in Experiments”*.

*Abstract:* There is a long debate in experimental design between the classic randomization design of Fisher, Yates, Kempthorne, Cochran, and those who advocate deterministic assignments based on notions of optimality. In nonsequential trials comparing treatment and control, covariate measurements for each subject are known in advance, and subjects can be divided into two groups based on a criterion of imbalance. With the advent of modern computing, this partition can be made nearly perfectly balanced via numerical optimization, but these allocations are far from random. These perfect allocations may endanger estimation relative to classic randomization because unseen subject-specific characteristics can be highly imbalanced. To demonstrate this, we consider different performance criteria such as Efron’s worst-case analysis and our original tail criterion of mean squared error. Under our tail criterion for the differences-in-mean estimator, we prove asymptotically that the optimal design must be more random than perfect balance but is less random than completely random. Our result vindicates restricted designs that are used regularly such as blocking and rerandomization. For a covariate-adjusted estimator, balancing offers less rewards and it seems good performance is achievable with complete randomization. Further work will provide a procedure to find the explicit optimal design in different scenarios in practice. Supplementary materials for this article are available online.

The paper can be found here <https://amstat.tandfonline.com/doi/abs/10.1080/00031305.2020.1717619#.Xj1kf…>.

*Where:* CGIS Knafel Building, Room K354 (see this link <https://map.harvard.edu/?bld=04471&level=9> for directions).

*When:* Wednesday, February 12 at 12 noon - 1:30 pm.

As always, all are welcome and lunch will be provided.

Best,
Georgie


Gov 3009 Applied Statistics Workshop (02/05) - Gary King

by Evans, Georgina

Hi all,

Our next meeting will be *Wednesday, February 5*, where Gary King will present research on *“Statistically Valid Inferences from Privacy Protected Data”*.

*Abstract:* Unprecedented quantities of data that could help social scientists understand and ameliorate the challenges of human society are presently locked away inside companies, governments, and other organizations, in part because of worries about privacy violations. We address this problem with a general-purpose data access and analysis system with mathematical guarantees of privacy for individuals who may be represented in the data, statistical guarantees for researchers seeking insights from it, and protection for society from some fallacious scientific conclusions. We build on the standard of “differential privacy” but, unlike most such approaches, we also correct for the serious statistical biases induced by privacy-preserving procedures, provide a proper accounting for statistical uncertainty, and impose minimal constraints on the choice of data analytic methods and types of quantities estimated. Our algorithm is easy to implement, simple to use, and computationally efficient; we also offer open source software to illustrate all our methods.

Slides and paper here <https://gking.harvard.edu/presentations/statistically-valid-inferences-priv…>.

*Where:* CGIS Knafel Building, Room K354 (see this link <https://map.harvard.edu/?bld=04471&level=9> for directions).

*When:* Wednesday, February 5 at 12 noon - 1:30 pm.

All are welcome and lunch will be provided.

Best,
Georgie

