gov3009-l April 2018

gov3009-l@lists.fas.harvard.edu

1 participant
4 discussions

by Dana Higgins

Hi everyone!

This week (the last meeting of the semester!) at the Applied Statistics Workshop we will be welcoming *Jeff Gill*, Professor of Statistics and Government at American University. He will be presenting work entitled *Models for Identifying Substantive Clusters and Fitted Subclusters in Social Science Data*. Please find the abstract below and on the Applied Stats website here <https://projects.iq.harvard.edu/applied.stats.workshop-gov3009>.

As usual, we will meet at noon in CGIS Knafel Room 354 and lunch will be provided. See you all there!

-- Dana Higgins

*Title:* *Models for Identifying Substantive Clusters and Fitted Subclusters in Social Science Data*

*Abstract:* Unseen grouping, often called latent clustering, is a common feature in social science data. Subjects may intentionally or unintentionally group themselves in ways that complicate the statistical analysis of substantively important relationships. This work introduces a new model-based clustering design which incorporates two sources of heterogeneity. The first source is a random effect that introduces substantively unimportant grouping but must be accounted for. The second source is more important and more difficult to handle since it is directly related to the relationships of interest in the data. We develop a model to handle both of these challenges and apply it to data on terrorist groups, which are notoriously hard to model with conventional tools.


Applied Stats 4/18

by Dana Higgins

Hi everyone!

This week at the Applied Statistics Workshop we will be welcoming *Xiang Zhou*, Professor of Government at Harvard University. He will be presenting work entitled *Two residual-based methods to adjust for treatment-induced confounding in causal inference*. Please find the abstract below and on the Applied Stats website here <https://projects.iq.harvard.edu/applied.stats.workshop-gov3009>.

As usual, we will meet at noon in CGIS Knafel Room 354 and lunch will be provided. See you all there!

-- Dana Higgins

*Title:* *Two residual-based methods to adjust for treatment-induced confounding in causal inference*

*Abstract:* Treatment-induced confounding arises in both causal inference of time-varying treatments and causal mediation analysis where post-treatment variables affect both the mediator and outcome. Existing methods to adjust for treatment-induced confounding include, among others, Robins's structural nested mean model (SNMM) with its g-estimation and marginal structural models (MSM) with inverse probability weighting (IPW). In this talk, I describe two alternative methods, one called "regression-with-residuals" (RWR) and the other called "residual balancing," for estimating the marginal means of potential outcomes. The RWR method is a simple extension of Almirall et al.'s (2010) two-stage estimator for studying effect moderation to the estimation of marginal effects. In special cases, it is equivalent to Vansteelandt's (2009) sequential g-estimator for estimating controlled direct effects. The residual balancing method, on the other hand, can be considered a generalization of Hainmueller's (2012) entropy balancing method to time-varying settings. Numerical simulations show that the residual balancing method tends to be more efficient and more robust than IPW in a variety of settings.


Applied Statistics 4/11

by Dana Higgins

Hi everyone!

This week at the Applied Statistics Workshop we will be welcoming *Michael Windzio*, Professor of Sociology at the University of Bremen. He will be presenting work entitled *Does schoolwork cooperation improve pupils’ grades and well-being in school? Results from social network and propensity score analysis*. Please find the abstract below and on the Applied Stats website here <https://projects.iq.harvard.edu/applied.stats.workshop-gov3009>.

As usual, we will meet at noon in CGIS Knafel Room 354 and lunch will be provided. See you all there!

-- Dana Higgins

*Title:* *Does schoolwork cooperation improve pupils’ grades and well-being in school? Results from social network and propensity score analysis*

*Abstract:* Using panel data of school-class networks and outcomes of 11-13-year-old students, effects of collaboration in schoolwork networks on grades and school-related well-being will be investigated. The analysis might suffer from endogeneity bias because pupils also actively select their peers with regard to school performance. This selectivity will be demonstrated by using p* models for ties in schoolwork networks at t1, based on data from 1,289 pupils in 76 classrooms. Predictions from this model will be used to generate propensity scores. Stochastic actor-based models (SOAM) for the co-evolution of networks and behavior/attitudes (N = 244, k = 10) result in a systematic loss of data, whereas propensity score matching appropriately limits the data to the area of common support. However, violation of SUTVA requires that indicators of network embeddedness are controlled, which can be done in a propensity score weighting regression. Overall, results of SOAMs and propensity score matching suggest that schoolwork networks do not have significantly positive effects on either grades or well-being.


Applied Statistics 4/4

by Dana Higgins

Hi everyone!

This week at the Applied Statistics Workshop we will be welcoming *Francesca Dominici*, Professor of Statistics at the Harvard School of Public Health and Co-Director of the Harvard Data Science Initiative. She will be presenting work entitled *Data Science and Our Environment*. Please find the abstract below and on the Applied Stats website here <https://projects.iq.harvard.edu/applied.stats.workshop-gov3009>.

As usual, we will meet at noon in CGIS Knafel Room 354 and lunch will be provided. See you all there!

-- Dana Higgins

*Title:* *Data Science and Our Environment*

*Abstract:* What if I told you I had evidence of a serious threat to American national security – a terrorist attack in which a jumbo jet will be hijacked and crashed every 12 days? Thousands will continue to die unless we act now. This is the question before us today – but the threat doesn’t come from terrorists. The threat comes from climate change and air pollution. We have developed an artificial neural network model that uses on-the-ground air-monitoring data and satellite-based measurements to estimate daily pollution levels across the continental U.S., breaking the country up into 1-square-kilometer zones. We have paired that information with health data contained in Medicare claims records from the last 12 years, covering 97% of the population ages 65 or older. We have developed statistical methods and computationally efficient algorithms for the analysis of over 460 million health records. Our research shows that short- and long-term exposure to air pollution is killing thousands of senior citizens each year. This data science platform is telling us that federal limits on the nation’s most widespread air pollutants are not stringent enough. This type of data is the sign of a new era for the role of data science in public health, and also for the associated methodological challenges. For example, with enormous amounts of data, the threat of unmeasured confounding bias is amplified, and causality is even harder to assess with observational studies. These and other challenges will be discussed.


