Dear Applied Statistics Workshop Community,
Our next meeting of the semester will be on April 26 (12:00 ET). Dean Knox
and Guilherme Duarte will present "Optimal Allocation of Data-Collection
Resources."
<Where>
CGIS K354
Bagged lunches are available for pick-up at 11:45 (CGIS K354).
Zoom:
https://harvard.zoom.us/j/99181972207?pwd=Ykd3ZzVZRnZCSDZqNVpCSURCNnVvQT09
<Abstract>
Complications in applied work often prevent researchers from obtaining
unique point estimates of target quantities using cheaply available data—at
best, ranges of possibilities, or sharp bounds, can be reported. To make
progress, researchers frequently collect more information by (1)
re-cleaning existing datasets, (2) gathering secondary datasets, or (3)
pursuing entirely new designs. Common examples include manually correcting
missingness, recontacting attrited units, validating proxies with
ground-truth data, finding new instrumental variables, and conducting
follow-up experiments. These auxiliary tasks are costly, forcing tradeoffs
with (4) larger samples from the original approach. Researchers'
data-collection strategies, or choices over these tasks, are often based on
convenience or intuition. In this work, we show how to provably identify
the most cost-efficient data-collection strategy for a given research
problem.
We quantify the quality of existing data using the width of the confidence
regions on the sharp bounds, which captures two sources of uncertainty:
statistical uncertainty due to finite samples of the variables measured,
and fundamental uncertainty because some variables are not measured at all.
We then show how to compute the expected information gain, defined as the
expected amount by which each data-collection task will narrow these bounds
by addressing one or both sources of uncertainty. Finally, we select the
task with the greatest information efficiency, or gain per unit cost.
Leveraging recent advances in automatic bounding (Duarte et al., 2022), we
prove that this efficiency is computable for essentially any discrete causal
system, estimand, and auxiliary data task.
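As a rough illustration (not the authors' implementation), the selection rule
amounts to ranking candidate tasks by expected gain per unit cost; the task
names, gains, and costs in this Python sketch are hypothetical:

# Illustrative sketch only: choose the data-collection task with the greatest
# information efficiency, i.e., expected narrowing of the bounds per unit cost.
# Task names, expected gains, and costs below are made-up placeholders.
tasks = {
    # task: (expected narrowing of the confidence region on the bounds, cost)
    "re-clean existing data":    (0.04, 500.0),
    "validate proxy subsample":  (0.10, 2000.0),
    "more samples, same design": (0.06, 1500.0),
}

def information_efficiency(expected_gain: float, cost: float) -> float:
    """Expected reduction in bound width per unit cost."""
    return expected_gain / cost

best_task = max(tasks, key=lambda t: information_efficiency(*tasks[t]))
print(best_task)  # "re-clean existing data" under these made-up numbers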
Based on this theoretical framework, we develop a method for optimal
adaptive allocation of data-collection resources. Users first input a
causal graph, estimand, and past data. They then specify the distributions
from which future samples can be drawn, the associated fixed and per-sample
costs, and any prior beliefs. Our method automatically derives and sequentially updates
the optimal data-collection strategy.
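To make the workflow concrete, here is a minimal, purely schematic Python
sketch of such an adaptive loop, with hypothetical placeholder functions
standing in for the actual bound and gain computations:

# Schematic skeleton of adaptive allocation; every quantity below is a
# stand-in. The real method derives expected gains from the causal graph,
# estimand, and current bounds rather than drawing them at random.
import random

random.seed(0)

COSTS = {"re-clean": 500.0, "validate proxies": 2000.0, "new samples": 1500.0}

def expected_gain(task, data):
    # Stand-in for the expected narrowing of the bounds' confidence region.
    return random.uniform(0.0, 0.1)

def collect(task, data):
    # Stand-in for carrying out the task and appending the resulting data.
    return data + [task]

def allocate(budget, data):
    while True:
        affordable = [t for t in COSTS if COSTS[t] <= budget]
        if not affordable:
            return data
        # Greedy choice by information efficiency: expected gain per unit cost.
        best = max(affordable, key=lambda t: expected_gain(t, data) / COSTS[t])
        budget -= COSTS[best]
        data = collect(best, data)

print(allocate(budget=5000.0, data=[]))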
<2022-2023 Schedule>
GOV 3009 Website:
https://projects.iq.harvard.edu/applied.stats.workshop-gov3009
Calendar:
https://calendar.google.com/calendar/embed?src=c_3v93pav9fjkkldrbu9snbhned8…
Best,
Shusei
Dear Applied Statistics Workshop Community,
Our next meeting of the semester will be on April 19 (12:00 ET). Michela
Carlana will present "Revealing Stereotypes: Evidence from Immigrants in
Schools."
<Where>
CGIS K354
Bagged lunches are available for pick-up at 11:45 (CGIS K354).
Zoom:
https://harvard.zoom.us/j/99181972207?pwd=Ykd3ZzVZRnZCSDZqNVpCSURCNnVvQT09
<Abstract>
We study how people change their behavior after learning they are biased.
Teachers in Italian schools give lower grades to immigrant students
relative to natives with comparable ability. In two experiments, we reveal
to teachers their own bias, measured by an Implicit Association Test (IAT).
Randomizing the timing of disclosure, we find that learning one’s IAT score
before assigning end-of-term grades reduces the native-immigrant gap in
grades. IAT disclosure and generic debiasing have similar average effects,
but there is heterogeneity: teachers with more negative stereotypes do not
respond to generic debiasing but change their behavior when informed about
their own IAT.
<2022-2023 Schedule>
GOV 3009 Website:
https://projects.iq.harvard.edu/applied.stats.workshop-gov3009
Calendar:
https://calendar.google.com/calendar/embed?src=c_3v93pav9fjkkldrbu9snbhned8…
Best,
Shusei
Dear Applied Statistics Workshop Community,
Our next meeting of the semester will be on April 12 (12:00 ET). Naoki
Egami will present "Empirical Strategies Toward External Validity:
Framework and External Robustness."
<Where>
CGIS K354
Bagged lunches are available for pick-up at 11:45 (CGIS K354).
Zoom:
https://harvard.zoom.us/j/99181972207?pwd=Ykd3ZzVZRnZCSDZqNVpCSURCNnVvQT09
<Abstract>
Over the last few decades, social scientists have developed and applied a
host of statistical methods to make valid causal inferences, a trend known as
the credibility revolution. This trend has focused primarily on internal
validity: researchers aim to estimate causal effects without bias within a
given study. However, one of the most important long-standing methodological
debates concerns external validity: how scientists can generalize causal
findings beyond a specific study. This question has a long history in the
social sciences, going back to at least the 1960s, and it has become even
more pressing as the opportunities and challenges of accumulating causal
knowledge across studies have become evident.
In this talk, I will discuss a set of empirical strategies to improve
external validity in practice. I will briefly introduce a formal framework of
external validity (Egami and Hartman, 2022; APSR) that synthesizes diverse
external validity concerns. Then, I will propose a simple new approach to
quantify the robustness of experimental results to external validity bias
(Devaux and Egami, 2022; Egami and Rothenhäusler, 2023+). In particular, I
introduce a measure of external robustness, which ranges from 0 to 1 and
represents how well causal effects estimated in one’s study can be
generalized to other populations and contexts. Researchers can estimate
this quantity using only experimental data (i.e., no additional data
collection), and users can also account for unmeasured confounders. I
discuss a debiased estimator, which is consistent and asymptotically normal
under mild rate conditions that allow for the use of machine learning
estimators. Finally, I provide default benchmarks and discuss practical
guidance on how to report external robustness in practice using the R
package “exr” (https://github.com/naoki-egami/exr).
Papers: (1) https://naokiegami.com/paper/external_full.pdf (2)
https://naokiegami.com/paper/external_robust.pdf
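The external-robustness measure itself is defined in the papers above; as
loose background on the generalization problem only, the Python sketch below
reweights an experimental effect estimate toward a hypothetical target
population. It is a textbook post-stratification exercise with simulated
data, not the estimator implemented in the exr package:

# Background illustration only: transporting an experimental ATE to a target
# population by reweighting on one observed covariate. This is NOT the
# external-robustness estimator from the talk. All data are simulated.
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

x = rng.binomial(1, 0.3, n)                       # covariate in the experiment
t = rng.binomial(1, 0.5, n)                       # randomized treatment
y = 1.0 * t + 2.0 * t * x + rng.normal(size=n)    # effect is larger when x == 1

# Stratum-specific effect estimates within the experiment.
effect_by_x = {
    v: y[(t == 1) & (x == v)].mean() - y[(t == 0) & (x == v)].mean()
    for v in (0, 1)
}

sample_ate = sum(effect_by_x[v] * (x == v).mean() for v in (0, 1))

# Hypothetical target population where x == 1 is far more common.
target_share_x1 = 0.7
target_ate = (effect_by_x[0] * (1 - target_share_x1)
              + effect_by_x[1] * target_share_x1)

print(f"in-sample ATE:   {sample_ate:.2f}")   # roughly 1 + 2 * 0.3 = 1.6
print(f"transported ATE: {target_ate:.2f}")   # roughly 1 + 2 * 0.7 = 2.4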
<2022-2023 Schedule>
GOV 3009 Website:
https://projects.iq.harvard.edu/applied.stats.workshop-gov3009
Calendar:
https://calendar.google.com/calendar/embed?src=c_3v93pav9fjkkldrbu9snbhned8…
Best,
Shusei
Dear Applied Statistics Workshop Community,
Our next meeting of the semester will be on April 5 (12:00 ET). Fredrik
Sävje will present "A Design-Based Riesz Representation Framework for
Randomized Experiments."
<Where>
CGIS K354
Bagged lunches are available for pick-up at 11:45 (CGIS K354).
Zoom:
https://harvard.zoom.us/j/99181972207?pwd=Ykd3ZzVZRnZCSDZqNVpCSURCNnVvQT09
<Abstract>
We describe a new design-based framework for drawing causal inference in
randomized experiments. Causal effects in the framework are defined as
linear functionals evaluated at potential outcome functions. Knowledge and
assumptions about the potential outcome functions are encoded as function
spaces. This makes the framework expressive, allowing experimenters to
formulate and investigate a wide range of causal questions. We describe a
class of estimators for estimands defined using the framework and
investigate their properties. The construction of the estimators is based
on the Riesz representation theorem. We provide necessary and sufficient
conditions for unbiasedness and consistency. Finally, we provide conditions
under which the estimators are asymptotically normal, and describe a
conservative variance estimator to facilitate the construction of
confidence intervals for the estimands.
Paper: https://arxiv.org/abs/2210.08698
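As informal background (not the construction in the paper), the Python sketch
below shows the simplest special case: under a Bernoulli(p) assignment
design, weighting observed outcomes by T/p - (1 - T)/(1 - p) yields the
familiar Horvitz-Thompson estimator of the average treatment effect, with
that weight playing the role of the representer. The data are simulated:

# Background illustration only: the Horvitz-Thompson estimator for the ATE
# under a Bernoulli(p) design, written with the weight T/p - (1 - T)/(1 - p).
# Simulated data; not the paper's code.
import numpy as np

rng = np.random.default_rng(1)
n, p = 10_000, 0.4

y0 = rng.normal(0.0, 1.0, n)         # potential outcomes under control
y1 = y0 + 2.0                        # constant treatment effect of 2
t = rng.binomial(1, p, n)            # Bernoulli(p) assignment
y = np.where(t == 1, y1, y0)         # observed outcomes

weights = t / p - (1 - t) / (1 - p)  # representer evaluated at each unit
ate_hat = np.mean(weights * y)       # Horvitz-Thompson estimate

print(f"true ATE: 2.00, estimate: {ate_hat:.2f}")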
<2022-2023 Schedule>
GOV 3009 Website:
https://projects.iq.harvard.edu/applied.stats.workshop-gov3009
Calendar:
https://calendar.google.com/calendar/embed?src=c_3v93pav9fjkkldrbu9snbhned8…
Best,
Shusei