Dear Applied Statistics Workshop Community,
Our last meeting of this semester will be on Wednesday, May 1 (12:00 EDT).
Kosuke Imai presents "Does AI help humans make better decisions? A
methodological framework for experimental evaluation" (joint work with Eli
Ben-Michael, D. James Greiner, Melody Huang, Zhichao Jiang, Sooahn Shin).
<When>
May 1, 12:00 to 1:30 PM, EDT
Lunch will be available for pick-up inside CGIS K354.
<Where>
In-person: CGIS K354
Zoom:
https://harvard.zoom.us/j/93217566507?pwd=elBwYjRJcWhlVE5teE1VNDZoUXdjQT09
<Abstract>
The use of Artificial Intelligence (AI) based on data-driven algorithms has
become ubiquitous in today's society. Yet, in many cases and especially
when stakes are high, humans still make final decisions. The critical
question, therefore, is whether AI helps humans make better decisions as
compared to a human-alone or AI-alone system. We introduce a new
methodological framework that can be used to answer this question
experimentally without additional assumptions. We measure a decision maker's
ability to make correct decisions using standard classification metrics
based on the baseline potential outcome. We consider a single-blinded
experimental design, in which the provision of AI-generated recommendations
is randomized across cases with a human making final decisions. Under this
experimental design, we show how to compare the performance of three
alternative decision-making systems--human-alone, human-with-AI, and
AI-alone. We apply the proposed methodology to the data from our own
randomized controlled trial of a pretrial risk assessment instrument. We
find that AI recommendations do not improve the classification accuracy of
a judge's decision to impose cash bail. Our analysis also shows that
AI-alone decisions generally perform worse than human decisions with or
without AI assistance. Finally, AI recommendations tend to impose cash bail
on non-white arrestees more often than necessary when compared to white
arrestees.
The paper is available on arXiv: https://arxiv.org/pdf/2403.12108.pdf
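As a toy illustration of the single-blinded design described above (not the paper's full potential-outcomes framework), randomizing the provision of AI recommendations permits a direct arm-to-arm comparison of misclassification rates. All data and names below are simulated and hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: each case is randomized to receive (ai=1) or not receive (ai=0)
# an AI recommendation before a human makes the final binary decision.
n = 10_000
ai = rng.integers(0, 2, size=n)          # single-blinded randomization
correct = rng.integers(0, 2, size=n)     # the (assumed known) correct decision
# Simulate a human who errs 30% of the time regardless of the recommendation.
err = rng.random(n) < 0.30
decision = np.where(err, 1 - correct, correct)

def misclassification_rate(d, c):
    """Share of cases where the final decision differs from the correct one."""
    return float(np.mean(d != c))

rate_with_ai = misclassification_rate(decision[ai == 1], correct[ai == 1])
rate_human_alone = misclassification_rate(decision[ai == 0], correct[ai == 0])
effect_of_ai = rate_with_ai - rate_human_alone   # ~0 in this simulation
```

In this simulation AI provision has no effect by construction, so the two rates should be statistically indistinguishable, which is essentially the paper's empirical finding for cash-bail decisions.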
<2023-2024 Schedule>
GOV 3009 Website:
https://projects.iq.harvard.edu/applied.stats.workshop-gov3009
Calendar:
https://calendar.google.com/calendar/u/0?cid=Y18zdjkzcGF2OWZqa2tsZHJidTlzbm…
Best,
Jialu
--
Jialu Li
Department of Government
Harvard University
https://jialul.github.io/
Dear Applied Statistics Workshop Community,
Our next meeting will be on Wednesday, April 24 (12:00 EDT). Siyu Heng
presents "Design-Based Causal Inference with Missing Outcomes: Missingness
Mechanisms, Imputation-Assisted Randomization Tests, and Covariate
Adjustment."
<When>
April 24, 12:00 to 1:30 PM, EDT
Lunch will be available for pick-up inside CGIS K354.
<Where>
In-person: CGIS K354
Zoom:
https://harvard.zoom.us/j/93217566507?pwd=elBwYjRJcWhlVE5teE1VNDZoUXdjQT09
<Abstract>
Design-based causal inference, also known as randomization-based or
finite-population causal inference, is one of the most widely used causal
inference frameworks, largely due to the merit that its statistical
validity can be guaranteed by the study design (e.g., randomized
experiments) and does not require assuming specific outcome-generating
distributions or super-population models. Despite its advantages,
design-based causal inference can still suffer from other data-related
issues, among which outcome missingness is a prevalent and significant
challenge. This work systematically studies the outcome missingness problem
in design-based causal inference. First, we propose a general and flexible
outcome missingness mechanism that can facilitate finite-population-exact
randomization tests for the null effect. Second, under this flexible
missingness mechanism, we propose a general framework called "imputation
and re-imputation" for conducting finite-population-exact randomization
tests in design-based causal inference with missing outcomes. This
framework can incorporate any imputation algorithms (from linear models to
advanced machine learning-based imputation algorithms) while ensuring
finite-population-exact type-I error rate control. Third, we extend our
framework to conduct covariate adjustment in randomization tests and
construct finite-population-valid confidence sets with missing outcomes.
Our framework is evaluated via extensive simulation studies and applied to
a cluster randomized experiment called the Work, Family, and Health Study.
Open-source Python and R packages "iArt" (imputation-assisted randomization
test) have been developed to implement our framework.
This talk is based on joint work with Yang Feng and Jiawei Zhang. The
working paper is available on arXiv: https://arxiv.org/abs/2310.18556
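As a rough sketch of the "imputation and re-imputation" idea: impute missing outcomes under the realized assignment, then re-impute under each permuted assignment before recomputing the test statistic. The deliberately simple arm-mean imputer below stands in for any imputation algorithm; all names and simulated data are hypothetical and this is not the iArt API.

```python
import numpy as np

def arm_mean_impute(y, observed, z):
    """Deliberately simple stand-in for any imputation algorithm: fill each
    arm's missing outcomes with that arm's observed mean."""
    y = y.copy()
    for arm in (0, 1):
        m = z == arm
        y[m & ~observed] = y[m & observed].mean()
    return y

def imputation_randomization_test(y, observed, z, impute=arm_mean_impute,
                                  n_perm=999, seed=0):
    """Randomization test of the sharp null of no effect with missing
    outcomes: impute under the realized assignment, then re-impute under
    each permuted assignment before recomputing the test statistic."""
    rng = np.random.default_rng(seed)
    y_imp = impute(y, observed, z)
    t_obs = abs(y_imp[z == 1].mean() - y_imp[z == 0].mean())
    hits = 0
    for _ in range(n_perm):
        z_p = rng.permutation(z)
        y_re = impute(y, observed, z_p)      # the re-imputation step
        t = abs(y_re[z_p == 1].mean() - y_re[z_p == 0].mean())
        hits += t >= t_obs
    return (1 + hits) / (1 + n_perm)         # randomization p-value

# Hypothetical usage: a two-arm experiment with a unit treatment effect and
# roughly 20% missing outcomes should yield a small p-value.
rng = np.random.default_rng(42)
z = np.repeat([0, 1], 100)
y_full = rng.normal(size=200) + 1.0 * z
observed = rng.random(200) < 0.8
y = np.where(observed, y_full, np.nan)
p_value = imputation_randomization_test(y, observed, z)
```

Any imputer with the same signature (a fitted linear model, a machine-learning imputer) can be dropped in for `arm_mean_impute` without changing the test's logic.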
Best,
Jialu
Dear Applied Statistics Workshop Community,
Just a quick reminder: our next meeting is Wednesday, April 17 (12:00 EDT).
Connor Jerzak presents "Selecting Optimal Candidate Profiles in Adversarial
Environments Using Conjoint Analysis" (joint with Kosuke Imai).
<When>
April 17, 12:00 to 1:30 PM, EDT
Lunch will be available for pick-up inside CGIS K354.
<Where>
In-person: CGIS K354
Zoom:
https://harvard.zoom.us/j/93217566507?pwd=elBwYjRJcWhlVE5teE1VNDZoUXdjQT09
<Abstract>
Conjoint analysis, an application of factorial experimental design, is a
popular tool in social science research for studying multidimensional
preferences. In political science applications of such experiments,
respondents are asked to choose between two hypothetical political
candidates with randomly selected features, which can include partisanship,
policy positions, gender and race. We consider the problem of identifying
optimal candidate profiles. Because the number of unique feature
combinations far exceeds the total number of observations in a typical
conjoint experiment, it is impossible to determine the optimal profile
exactly. To address this identification challenge, we derive an optimal
stochastic intervention that represents a probability distribution of
various attributes aimed at achieving the most favorable average outcome.
We first consider an environment where one political party optimizes their
candidate selection. We then move to the more realistic case where two
political parties optimize their own candidate selection simultaneously and
in opposition to each other. We apply the proposed methodology to an
existing candidate choice conjoint experiment concerning vote choice for US
president. We find that, in contrast to the non-adversarial approach,
expected outcomes in the adversarial regime fall within the range of historical
electoral outcomes, with optimal strategies suggested by the method more
likely to match the actual observed candidates compared to strategies
derived from a non-adversarial approach. These findings indicate that
incorporating adversarial dynamics into conjoint analysis may yield unique
insights into experimental data in the social sciences.
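As a deliberately oversimplified sketch of profile optimization in a conjoint experiment: if we assume away attribute interactions and the adversarial setting, "optimizing" reduces to picking the level with the highest mean outcome for each attribute. The paper instead derives an optimal stochastic intervention, including the case of two parties optimizing simultaneously; the names and data below are hypothetical.

```python
import numpy as np

def best_profile(outcome, attrs):
    """Toy, non-adversarial profile optimization: for each attribute,
    estimate the mean outcome at each level and keep the best level.
    attrs: dict mapping attribute name -> array of levels per observation."""
    profile = {}
    for name, levels in attrs.items():
        means = {lv: outcome[levels == lv].mean() for lv in np.unique(levels)}
        profile[name] = max(means, key=means.get)
    return profile

# Hypothetical data: support is higher on average for one party and gender.
rng = np.random.default_rng(5)
n = 2_000
party = rng.choice(["D", "R"], size=n)
gender = rng.choice(["F", "M"], size=n)
support = 0.2 * (party == "D") + 0.1 * (gender == "F") + rng.normal(0, 0.05, n)
profile = best_profile(support, {"party": party, "gender": gender})
```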
Best,
Jialu
Dear Applied Statistics Workshop Community,
Just a quick reminder: our next meeting is Wednesday, April 10 (12:00 EDT).
Melissa Dell presents “Efficient OCR for Building a Diverse Digital
History.”
<When>
April 10, 12:00 to 1:30 PM, EDT
Lunch will be available for pick-up inside CGIS K354.
<Where>
In-person: CGIS K354
Zoom:
https://harvard.zoom.us/j/93217566507?pwd=elBwYjRJcWhlVE5teE1VNDZoUXdjQT09
<Abstract>
Thousands of users consult digital archives daily, but the information they
can access is unrepresentative of the diversity of documentary history. The
sequence-to-sequence architecture typically used for optical character
recognition (OCR), which jointly learns a vision and language model, is
poorly extensible to low-resource document collections, as learning a
language-vision model requires extensive labeled sequences and compute.
This study models OCR as a character-level image retrieval problem, using a
contrastively trained vision encoder. Because the model only learns
characters’ visual features, it is more sample efficient and extensible
than existing architectures, enabling accurate OCR in settings where
existing solutions fail. Crucially, the model opens new avenues for
community engagement in making digital history more representative of
documentary history. Beyond OCR, the presentation will also discuss how
large differences in sample efficiency across neural network architectures
influence the types of learning best suited to academic applications,
particularly in low-resource settings.
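The retrieval framing can be sketched in a few lines: embed each character crop and label it with the nearest reference glyph. The embeddings below are hypothetical stand-ins for the output of a contrastively trained vision encoder.

```python
import numpy as np

def retrieve_chars(query_emb, ref_emb, ref_labels):
    """OCR as character-level image retrieval: assign each query embedding
    the label of its nearest reference glyph by cosine similarity."""
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    r = ref_emb / np.linalg.norm(ref_emb, axis=1, keepdims=True)
    sims = q @ r.T                  # cosine similarities, queries x references
    return [ref_labels[i] for i in sims.argmax(axis=1)]

# Hypothetical usage: three reference glyphs and two noisy query crops.
refs = np.eye(3)
labels = ["a", "b", "c"]
queries = np.array([[0.9, 0.1, 0.0],
                    [0.0, 0.2, 0.8]])
decoded = retrieve_chars(queries, refs, labels)   # ["a", "c"]
```

Because only the visual side is learned, new scripts or fonts can be supported by adding reference glyphs rather than retraining a sequence model.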
Best,
Jialu
Dear Applied Statistics Workshop Community,
Just a quick reminder: our next meeting is Wednesday, April 3 (12:00 EDT).
Zeyang Yu will present "A Binary IV Model for Persuasion: Profiling
Persuasion Types among Compliers."
<When>
April 3, 12:00 to 1:30 PM, EDT
Lunch will be available for pick-up inside CGIS K354.
<Where>
In-person: CGIS K354
Zoom:
https://harvard.zoom.us/j/93217566507?pwd=elBwYjRJcWhlVE5teE1VNDZoUXdjQT09
<Abstract>
In the empirical study of persuasion, researchers often use a binary
instrument to encourage individuals to consume information and take some
action. We show that with the Imbens-Angrist instrumental variable model
assumptions and the monotone treatment response assumption, it is possible
to identify the joint distributions of potential outcomes among compliers.
This is necessary to identify the percentage of persuaded individuals and
their statistical characteristics. Specifically, we develop a weighting
method that helps researchers identify the statistical characteristics of
persuasion types: compliers and always-persuaded, compliers and persuaded,
and compliers and never-persuaded. These findings extend the "κ weighting"
results in Abadie (2003). We also provide a sharp test of the two sets of
identification assumptions. The test boils down to testing whether there
exists a nonnegative solution to a possibly under-determined system of
linear equations with known coefficients. An application based on Green et
al. (2003) is provided. The results show that, among compliers, roughly 10%
of voters are persuaded. This is consistent with the finding that voters'
voting behavior is highly persistent.
Link to the paper: https://arthurzeyangyu.github.io/jmp/yu_2023local.pdf
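For reference, the κ-weighting result of Abadie (2003) that the paper extends can be sketched directly: the weights κ re-weight the full sample so that weighted means recover means among compliers. The simulated data below (complier, always-taker, and never-taker types with different covariate means) are hypothetical.

```python
import numpy as np

def kappa_weights(z, d, pz):
    """Abadie (2003) kappa weights. For any characteristic g,
    E[g | complier] = E[kappa * g] / E[kappa].
    z: binary instrument; d: binary treatment; pz: P(Z = 1 | X)."""
    return 1.0 - d * (1 - z) / (1 - pz) - (1 - d) * z / pz

def complier_mean(x, z, d, pz):
    """Mean of covariate x among compliers via kappa weighting."""
    k = kappa_weights(z, d, pz)
    return float(np.sum(k * x) / np.sum(k))

# Hypothetical simulation: compliers have covariate mean 2, always-takers 0,
# never-takers 5; kappa weighting should recover the complier mean of 2.
rng = np.random.default_rng(9)
n = 200_000
types = rng.choice(["c", "a", "n"], size=n, p=[0.5, 0.25, 0.25])
z = rng.integers(0, 2, size=n)
d = ((types == "a") | ((types == "c") & (z == 1))).astype(float)
x = rng.normal(loc=np.select([types == "c", types == "a"], [2.0, 0.0], 5.0))
est = complier_mean(x, z, d, pz=0.5)
```

Note the weights are negative for observations whose (z, d) pattern is incompatible with compliance; that is what nets out the always- and never-takers.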
Best,
Jialu
Dear Applied Statistics Workshop Community,
Our next meeting will be on Wednesday, March 27 (12:00 EDT). Shuangning Li
presents "Experimenting under Stochastic Congestion."
<When>
March 27, 12:00 to 1:30 PM, EDT
Lunch will be available for pick-up inside CGIS K354.
<Where>
In-person: CGIS K354
Zoom:
https://harvard.zoom.us/j/93217566507?pwd=elBwYjRJcWhlVE5teE1VNDZoUXdjQT09
<Abstract>
We study randomized experiments in a service system when stochastic
congestion can arise from temporarily limited supply and/or demand. Such
congestion gives rise to cross-unit interference between the waiting
customers, and analytic strategies that do not account for this
interference may be biased. In current practice, one of the most widely
used ways to address stochastic congestion is to use switchback experiments
that alternately turn a target intervention on and off for the whole
system. We find, however, that under a queueing model for stochastic
congestion, the standard way of analyzing switchbacks is inefficient, and
that estimators that leverage the queueing model can be materially more
accurate. We also consider a new class of experimental design, which can be
used to estimate a policy gradient of the dynamic system using only
unit-level randomization, thus alleviating key practical challenges that
arise in running a switchback.
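A toy simulation makes the congestion mechanism concrete: in a discrete-time single-server queue under a switchback, waiting customers carry over across block boundaries, so the naive on-minus-off contrast is contaminated by interference. All parameters and names below are hypothetical; this sketch only illustrates the problem, not the paper's queueing-model estimators.

```python
import numpy as np

def simulate_switchback(n_periods=20_000, block=50, arrival_rate=0.9,
                        mu_off=1.0, mu_on=1.3, seed=3):
    """Toy discrete-time single-server queue under a switchback design that
    alternates a faster service rate (the intervention) on and off in
    blocks. Returns the naive difference in mean queue length between
    regimes; carryover congestion across block boundaries biases it."""
    rng = np.random.default_rng(seed)
    on = (np.arange(n_periods) // block) % 2 == 1
    q = 0
    logs = {True: [], False: []}
    for t in range(n_periods):
        q += rng.poisson(arrival_rate)                            # arrivals
        q = max(0, q - rng.poisson(mu_on if on[t] else mu_off))   # services
        logs[bool(on[t])].append(q)
    return float(np.mean(logs[True]) - np.mean(logs[False]))

# Faster service should shorten queues, so the estimate is negative, but its
# magnitude depends on the block length via carryover congestion.
effect_estimate = simulate_switchback()
```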
Best,
Jialu
Dear Applied Statistics Workshop Community,
Our next meeting will be on Wednesday, March 20 (12:00 EDT). Anton
Strezhnev presents "A Guide to Dynamic Difference-in-Differences
Regressions for Political Scientists."
<When>
March 20, 12:00 to 1:30 PM, EDT
Lunch will be available for pick-up inside CGIS K354.
<Where>
In-person: CGIS K354
Zoom:
https://harvard.zoom.us/j/93217566507?pwd=elBwYjRJcWhlVE5teE1VNDZoUXdjQT09
<Abstract>
Difference-in-differences (DiD) designs for estimating causal effects have
grown in popularity throughout political science. Many DiD papers present
their central results through an "event study" plot: a visualization that
combines estimated dynamic average treatment effects for multiple
post-treatment time periods alongside placebo tests of the main identifying
assumption: parallel trends. Despite their ubiquity, the methods used in
practice for the creation of these plots are not standardized and in many
cases the approaches adopted by researchers can result in misleading
inferences about both the treatment effects and the placebo tests. Building
on and synthesizing recent contributions in the econometric literature on
difference-in-differences designs, this paper illustrates some common
pitfalls through a replication of three recently published papers in major
political science journals. We identify three notable problems related to
the incorrect specification of the baseline comparison time, incorrect
inclusion of "always-treated" units, and sensitivity to effect homogeneity
assumptions. We help provide researchers with additional intuition for the
problems that arise due to effect heterogeneity and for the "contamination
bias" result of Sun and Abraham (2021) through a novel decomposition of the
dynamic event study regression in the style of Goodman-Bacon (2021) that
allows researchers to recover the weights placed on each 2x2 comparison
used to construct the effect estimates and placebos. These weights allow
researchers to gauge the sensitivity of each estimate to potential effect
heterogeneity.
Anton is happy to meet with students and faculty after the talk. Please
reach out to Jialu directly if you want to schedule 1:1 meetings with him.
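For readers who want a concrete baseline, here is a minimal sketch of the dynamic event-study regression the abstract critiques, with relative time -1 as the omitted comparison period. The simulated single-cohort data, variable names, and effect sizes are hypothetical, and the sketch does not implement the paper's weight decomposition.

```python
import numpy as np

def event_study(y, unit, time, cohort, leads, lags):
    """Dynamic ("event study") DiD via two-way fixed effects OLS, omitting
    relative time -1 as the baseline comparison period.
    cohort: first treated period per observation (np.inf = never treated).
    Returns {relative time: coefficient}."""
    rel = time - cohort
    ks = [k for k in range(-leads, lags + 1) if k != -1]
    event = [(rel == k).astype(float) for k in ks]
    fe = [(unit == u).astype(float) for u in np.unique(unit)]
    fe += [(time == t).astype(float) for t in np.unique(time)[1:]]
    X = np.column_stack(event + fe)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return dict(zip(ks, beta[: len(ks)]))

# Hypothetical data: 20 units first treated at t=4, 20 never-treated units,
# parallel trends, and a constant effect of 1 in every post-treatment period.
rng = np.random.default_rng(11)
n_units, n_periods = 40, 10
unit = np.repeat(np.arange(n_units), n_periods)
time = np.tile(np.arange(n_periods), n_units)
cohort = np.where(unit < 20, 4.0, np.inf)
u_fe = rng.normal(size=n_units)
t_fe = rng.normal(size=n_periods)
y = (u_fe[unit] + t_fe[time] + 1.0 * (time - cohort >= 0)
     + rng.normal(scale=0.01, size=n_units * n_periods))
est = event_study(y, unit, time, cohort, leads=4, lags=5)
```

With a single cohort, a never-treated group, and homogeneous effects, the lead coefficients are near zero and the lag coefficients near one; the pitfalls in the abstract (baseline misspecification, always-treated units, effect heterogeneity) arise precisely when these conditions fail.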
Best,
Jialu
Dear Applied Statistics Workshop Community,
Our next meeting will be on Wednesday, March 6 (12:00 EST). Amanda
Coston presents "Addressing confounding in decision-making algorithms."
<When>
March 6, 12:00 to 1:30 PM, EST
Lunch will be available for pick-up inside CGIS K354.
<Where>
In-person: CGIS K354
Zoom:
https://harvard.zoom.us/j/93217566507?pwd=elBwYjRJcWhlVE5teE1VNDZoUXdjQT09
<Abstract>
Machine learning algorithms are used for decision-making in societally
high-stakes settings from child welfare and criminal justice to healthcare
and consumer lending. These algorithms are often intended to predict
outcomes under a proposed decision. It is challenging to evaluate how well
these algorithms perform because we only observe the relevant outcome under
a biased sample of the population. In this talk, we explore how to use
techniques from causal inference to estimate performance on the full
population. We will consider several strategies to account for confounding
factors that affect the decision and the outcome. First, we study runtime
confounding where all relevant factors are captured in the historical data,
but it is either undesirable or impermissible to use some such factors in
the prediction model. Second, we study the setting with unobserved
confounders where we can bound the degree to which the outcome varies on
average between units receiving different decisions conditional on observed
covariates and identified nuisance parameters. We develop debiased machine
learning estimators for the learning target and predictive performance
estimands under both settings. We present empirical results in the consumer
lending and child welfare domains.
Papers: https://arxiv.org/abs/2212.09844 and https://arxiv.org/abs/2006.16916.
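A minimal illustration of the core evaluation problem: when outcomes are observed only for a decision-dependent subsample, the naive average loss over observed cases is biased, while weighting by the observation propensity recovers the full-population average. This is plain inverse propensity weighting with simulated, hypothetical data; the talk's estimators are debiased machine learning versions that additionally handle runtime and unobserved confounding.

```python
import numpy as np

def ipw_loss(loss, observed, p_obs):
    """Estimate average loss on the full population when the loss is only
    observed for a subsample: weight observed losses by 1 / P(observed|X)."""
    return float(np.sum(observed * loss / p_obs) / len(loss))

# Hypothetical example: group B's outcomes are rarely observed but carry
# high loss, so the naive average over observed cases looks too good.
rng = np.random.default_rng(2)
n = 100_000
group_b = rng.random(n) < 0.5
loss = np.where(group_b, 0.5, 0.1)            # true mean loss = 0.3
p_obs = np.where(group_b, 0.2, 0.9)           # known observation propensities
observed = rng.random(n) < p_obs
naive = float(loss[observed].mean())          # biased toward group A's 0.1
ipw = ipw_loss(loss, observed, p_obs)         # close to the true 0.3
```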
Best,
Jialu
Dear Applied Statistics Workshop Community,
Our next meeting will be on Wednesday, February 28 (12:00 EST). Phillip Heiler
presents "Heterogeneous Treatment Effect Bounds under Sample Selection with
an Application to the Effects of Social Media on Political Polarization."
<When>
February 28, 12:00 to 1:30 PM, EST
Lunch will be available for pick-up inside CGIS K354.
<Where>
In-person: CGIS K354
Zoom:
https://harvard.zoom.us/j/93217566507?pwd=elBwYjRJcWhlVE5teE1VNDZoUXdjQT09
<Abstract>
We propose a method for estimating and conducting inference on bounds for
heterogeneous causal effect parameters in general sample selection models
where the treatment can affect whether an outcome is observed and no
exclusion restrictions are available. The method provides conditional
effect bounds as functions of policy relevant pre-treatment variables. It
allows for conducting valid statistical inference on the unidentified
conditional effects. We use a flexible debiased/double machine learning
approach that can accommodate non-linear functional forms and
high-dimensional confounders. Easily verifiable high-level conditions for
estimation, misspecification robust confidence intervals, and uniform
confidence bands are provided as well. We re-analyze data from a
large-scale field experiment on Facebook on counter-attitudinal news
subscription with attrition. Our method yields substantially tighter effect
bounds compared to conventional methods and suggests depolarization effects
for younger users.
The paper is available on arXiv: https://arxiv.org/abs/2209.04329
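For orientation, a classical special case of such sample-selection bounds is the trimming bound of Lee (2009), sketched below; the talk's method generalizes this to covariate-conditional bounds estimated with debiased machine learning. All data and names here are hypothetical.

```python
import numpy as np

def lee_bounds(y_treat_obs, n_treat, y_ctrl_obs, n_ctrl):
    """Lee (2009) trimming bounds on the average effect among always-observed
    units, assuming treatment weakly increases the probability that the
    outcome is observed. y_*_obs: observed outcomes; n_*: full arm sizes."""
    s1, s0 = len(y_treat_obs) / n_treat, len(y_ctrl_obs) / n_ctrl
    q = (s1 - s0) / s1                    # excess share of treated to trim
    ys = np.sort(y_treat_obs)
    k = int(round(q * len(ys)))
    upper = ys[k:].mean() - np.mean(y_ctrl_obs)             # trim from below
    lower = ys[: len(ys) - k].mean() - np.mean(y_ctrl_obs)  # trim from above
    return float(lower), float(upper)

# Hypothetical check: with no differential attrition (q = 0) the bounds
# collapse to the simple difference in means.
y1 = np.array([1.0, 2.0, 3.0, 4.0])
y0 = np.array([0.0, 1.0, 2.0, 3.0])
lo, hi = lee_bounds(y1, 4, y0, 4)
# With attrition in the control arm, the interval widens around the effect.
lo2, hi2 = lee_bounds(y1, 4, np.array([0.0, 1.0]), 4)
```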
Best,
Jialu
Dear Applied Statistics Workshop Community,
Our next meeting will be on Wednesday, February 21 (12:00 EST). Ross
Mattheis presents "Spurious Mobility in Imperfectly Linked Data Trials"
(joint with Jiafeng Chen).
<When>
February 21, 12:00 to 1:30 PM, EST
Lunch will be available for pick-up inside CGIS K354.
<Where>
In-person: CGIS K354
Zoom:
https://harvard.zoom.us/j/93217566507?pwd=elBwYjRJcWhlVE5teE1VNDZoUXdjQT09
<Abstract>
Estimating intergenerational mobility often requires linking data across
multiple sources. However, mistakes in record linkage can introduce biases
in subsequent estimates. This paper re-examines the history of
intergenerational mobility in the United States with emphasis on bias from
imperfectly linked data. In particular, data corrupted by incorrect links
will typically attenuate estimates of linear estimands towards zero. When
the estimand is the intergenerational elasticity of status, this bias will
tend to exaggerate levels of mobility. We propose two complementary methods
to address bias from imperfectly linked data. Building on a large
literature on Bayesian entity resolution, our first approach samples from a
convenience prior and reports the ratio of the posterior and implicit prior
distributions for the target parameter. Our second approach takes advantage
of the availability of repeated measurements and identification results in
settings with misclassified data due to Hu (2008). Consistent with bias
from data-corruption, our estimates suggest that levels of mobility in the
U.S. were lower than previously believed, with conventional estimates of
the father-son elasticity of occupation status 10% to 40% lower than our
estimates. The gap between ours and conventional estimates is largest in
the mid-nineteenth century and declines in more recent years, resulting in
relatively stable levels of mobility over the period.
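The attenuation mechanism in the abstract is easy to demonstrate by simulation: if a share of sons is linked to a random, unrelated father, the estimated elasticity shrinks toward zero by roughly that share, which overstates mobility. All parameters below are hypothetical and unrelated to the paper's estimates.

```python
import numpy as np

def linked_slope(beta=0.6, n=100_000, false_match_rate=0.3, seed=7):
    """Simulate attenuation from imperfect record linkage: with probability
    false_match_rate, a son's record is linked to a random father's, so the
    OLS estimate of the elasticity shrinks toward zero."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                          # father's status
    y = beta * x + rng.normal(scale=0.5, size=n)    # son's status
    bad = rng.random(n) < false_match_rate
    x_link = np.where(bad, rng.permutation(x), x)   # corrupted links
    return float(np.polyfit(x_link, y, 1)[0])

# Expected slope is about (1 - 0.3) * 0.6 = 0.42 rather than the true 0.6.
slope = linked_slope()
```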
Best,
Jialu
--
Jialu Li
Department of Government
Harvard University
jialu_li@g.harvard.edu