Dear Applied Statistics Workshop Community,
Our next meeting of the semester will be at *12:10 pm (EDT) Wednesday,
March 30*, where Hannah Druckenmiller
<https://hannahdruckenmiller.com> (Resources
for the Future) presents "Accounting for Unobservable Heterogeneity in
Cross Section Using Spatial First Differences."
Please note that this meeting will be *entirely on Zoom
<https://harvard.zoom.us/j/97004196610?pwd=eGFydkF5RDRjUlk5RVcyTjV6OStUQT09>*.
*Abstract*
We develop a simple cross-sectional research design to identify causal
effects that is robust to unobservable heterogeneity. When many
observational units are dense in physical space, it may be sufficient to
regress the “spatial first differences” (SFD) of the outcome on the
treatment and omit all covariates. This approach is conceptually similar to
first differencing approaches in time-series or panel models, except the
index for time is replaced with an index for locations in space. The SFD
design identifies plausibly causal effects, so long as local changes in the
treatment and unobservable confounders are not systematically correlated
between immediately adjacent neighbors. We demonstrate the SFD approach by
recovering new cross-sectional estimates for the effects of time-invariant
geographic factors, soil and climate, on long-run average crop
productivities across US counties — relationships that are notoriously
confounded by unobservables but crucial for guiding economic decisions,
such as land management and climate policy.
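For readers who would like a concrete picture of the differencing idea in the abstract, here is a minimal sketch in Python. It is not the speaker's implementation. It assumes units sorted along a single spatial dimension, differences both the outcome and the treatment between adjacent neighbors, and uses simulated data; the function names (`ols_slope`, `sfd_slope`, `simulate`) and all parameter values are hypothetical.

```python
import numpy as np


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x with an intercept."""
    x_centered = x - x.mean()
    return float(x_centered @ (y - y.mean()) / (x_centered @ x_centered))


def sfd_slope(x, y):
    """Slope from regressing adjacent-neighbor differences of y on those of x.

    Assumes both arrays are already sorted by position along one spatial
    dimension, so np.diff compares each unit with its immediate neighbor.
    """
    return ols_slope(np.diff(x), np.diff(y))


def simulate(n=2000, beta=2.0, seed=0):
    """Simulated data (an assumption for illustration, not from the talk)."""
    rng = np.random.default_rng(seed)
    s = np.linspace(0.0, 1.0, n)                # unit locations on a line
    u = np.sin(2 * np.pi * s) + 3 * s           # smooth unobserved confounder
    x = u + rng.normal(0.0, 1.0, n)             # treatment, correlated with u in levels
    y = beta * x + 5 * u + rng.normal(0.0, 1.0, n)
    return x, y


if __name__ == "__main__":
    x, y = simulate()
    print("levels OLS slope:", ols_slope(x, y))
    print("SFD slope:       ", sfd_slope(x, y))
```

In this simulation the confounder varies smoothly over space, so it nearly cancels between immediate neighbors and the differenced regression lands much closer to the true coefficient than the regression in levels. That is the condition the abstract describes; the sketch does not cover two-dimensional layouts or inference.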
*Zoom link*:
https://harvard.zoom.us/j/97004196610?pwd=eGFydkF5RDRjUlk5RVcyTjV6OStUQT09
*When:* Wednesday, March 30 at 12:10 - 1:30 pm.
*Schedule of the workshop*:
https://projects.iq.harvard.edu/applied.stats.workshop-gov3009
Looking forward to seeing you all on Wednesday!
Best,
Sooahn
Dear Applied Statistics Workshop Community,
Our next meeting of the semester will be at *12:10 pm (EDT) Wednesday,
March 23*, where José R. Zubizarreta <http://jrzubizarreta.com/> (Harvard
University) presents "Bridging Matching, Regression, and Weighting as
Mathematical Programs for Causal Inference."
*Abstract*
A fundamental principle in the design of observational studies is to
approximate the randomized experiment that would have been conducted under
controlled circumstances. Across the health and social sciences,
statistical methods for covariate adjustment are used in pursuit of this
principle. Typical methods are matching, regression, and weighting. In this
talk, we will examine the connections between these methods through their
underlying mathematical programs. We will study their strengths and
weaknesses in terms of study design, computational tractability, and
statistical efficiency. Overall, we will discuss the role of mathematical
optimization for the design and analysis of studies of causal effects.
*Where:* CGIS Knafel Building, Room K354
(See this link <https://map.harvard.edu/?bld=04471&level=9> for directions).
*When:* Wednesday, March 23 at 12:10 - 1:30 pm.
(Bagged lunches available for pick-up at CGIS K354 *11:30 - 11:45 am*, for
the participants who responded to our previous survey. The CGIS cafe on the
first floor has been designated as an eating area, and participants may
also use outdoor spaces for lunch. Please be present at K354 by 12:10 pm
for the presentations.)
*Zoom link*:
https://harvard.zoom.us/j/97004196610?pwd=eGFydkF5RDRjUlk5RVcyTjV6OStUQT09
(For the participants who cannot join the session physically.)
*Schedule of the workshop*:
https://projects.iq.harvard.edu/applied.stats.workshop-gov3009
Looking forward to seeing you all on Wednesday!
Best,
Sooahn
Dear Applied Statistics Workshop Community,
Our next meeting of the semester will be at *12:10 pm (EST) Wednesday,
March 9*, where Iavor Bojinov <https://www.ibojinov.com> (Harvard
University) presents "Design and Analysis of Switchback Experiments."
*Abstract*
Switchback experiments, where a firm sequentially exposes an experimental
unit to random treatments, are among the most prevalent designs used in the
technology sector, with applications ranging from ride-hailing platforms to
online marketplaces. Although practitioners have widely adopted this
technique, the derivation of the optimal design has been elusive, hindering
practitioners from drawing valid causal conclusions with enough statistical
power. We address this limitation by deriving the optimal design of
switchback experiments under a range of different assumptions on the order
of the carryover effect --- the length of time a treatment persists in
impacting the outcome. We cast the optimal experimental design problem as a
minimax discrete optimization problem, identify the worst-case adversarial
strategy, establish structural results, and solve the reduced problem via a
continuous relaxation. For switchback experiments conducted under the
optimal design, we provide two approaches for performing inference. The
first provides exact randomization-based $p$-values, and the second uses a
new finite population central limit theorem to conduct conservative
hypothesis tests and build confidence intervals. We further provide
theoretical results when the order of the carryover effect is misspecified
and provide a data-driven procedure to identify the order of the carryover
effect. We conduct extensive simulations to study the numerical performance
and empirical properties of our results, and conclude with practical
suggestions.
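As a rough illustration of randomization-based inference for a switchback experiment, here is a minimal Python sketch. It is not the optimal design or the estimator from the talk. It assumes a hypothetical design that flips a fair coin in every period, ignores carryover effects, uses a plain difference in means, and approximates the exact p-value by Monte Carlo redraws of the assignment path. All function names are made up for this example.

```python
import numpy as np


def diff_in_means(y, w):
    """Difference between mean outcomes in treated and control periods."""
    return float(y[w == 1].mean() - y[w == 0].mean())


def draw_assignment(n_periods, rng):
    """Hypothetical design: one fair coin flip per period.

    Redraws until both arms appear so the difference in means is defined.
    """
    while True:
        w = rng.integers(0, 2, size=n_periods)
        if 0 < w.sum() < n_periods:
            return w


def randomization_p_value(y, w, n_draws=2000, seed=0):
    """Monte Carlo p-value under the sharp null of no effect in any period.

    Under that null the outcomes are fixed, so the statistic is recomputed
    on assignment paths redrawn from the same design.
    """
    rng = np.random.default_rng(seed)
    observed = abs(diff_in_means(y, w))
    draws = np.array([
        abs(diff_in_means(y, draw_assignment(len(y), rng)))
        for _ in range(n_draws)
    ])
    # The add-one correction keeps the Monte Carlo p-value valid.
    return float((1 + np.sum(draws >= observed)) / (1 + n_draws))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    w = draw_assignment(60, rng)
    y = rng.normal(size=60) + 3.0 * w   # simulated outcomes with a large effect
    print("p-value:", randomization_p_value(y, w))
```

The test is only valid here because the redraws come from the same design that produced the observed assignment. A design with carryover in mind would change both `draw_assignment` and the test statistic.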
*Where:* CGIS Knafel Building, Room K354
(See this link <https://map.harvard.edu/?bld=04471&level=9> for directions).
*When:* Wednesday, March 9 at 12:10 - 1:30 pm.
(Bagged lunches available for pick-up at CGIS K354 *11:30 - 11:45 am*, for
the participants who responded to our previous survey. The CGIS cafe on the
first floor has been designated as an eating area, and participants may
also use outdoor spaces for lunch. Please be present at K354 by 12:10 pm
for the presentations.)
*Zoom link*:
https://harvard.zoom.us/j/97004196610?pwd=eGFydkF5RDRjUlk5RVcyTjV6OStUQT09
(For the participants who cannot join the session physically.)
*Schedule of the workshop*:
https://projects.iq.harvard.edu/applied.stats.workshop-gov3009
Looking forward to seeing you all on Wednesday!
Best,
Sooahn
Dear Applied Statistics Workshop Community,
Our next meeting of the semester will be at *12:10 pm (EST) Wednesday,
March 2*, where Sharad Goel <https://5harad.com> (Harvard University)
presents "Designing Equitable Algorithms for Criminal Justice and Beyond."
*Abstract*
Machine learning algorithms are now used to automate routine tasks and to
guide high-stakes decisions, but, if not carefully designed, they can
exacerbate inequities. I’ll start by describing an evaluation of automated
speech recognition (ASR) tools, which power popular virtual assistants,
facilitate automated closed captioning, and enable digital dictation
platforms for health care. We find that five state-of-the-art ASR systems
-- developed by Amazon, Apple, Google, IBM, and Microsoft -- exhibited
substantial racial disparities, making twice as many errors for Black
speakers compared to white speakers, a gap we trace back to a lack of
diversity in the audio data used to train the models. I'll then describe
recent attempts to mathematically formalize fairness. I'll argue that some
of the most popular definitions, when used as a design principle, can,
perversely, harm the very groups they were created to protect. I'll
conclude by describing a general, consequentialist paradigm for designing
equitable algorithms that aims to mitigate the limitations of the dominant
approaches to building fair machine learning systems.
*Where:* CGIS Knafel Building, Room K354
(See this link <https://map.harvard.edu/?bld=04471&level=9> for directions).
*When:* Wednesday, March 2 at 12:10 - 1:30 pm.
(Bagged lunches available for pick-up at CGIS K354 *11:30 - 11:45 am*, for
the participants who responded to our previous survey. The CGIS cafe on the
first floor has been designated as an eating area, and participants may
also use outdoor spaces for lunch. Please be present at K354 by 12:10 pm
for the presentations.)
*Zoom link*:
https://harvard.zoom.us/j/97004196610?pwd=eGFydkF5RDRjUlk5RVcyTjV6OStUQT09
(For the participants who cannot join the session physically.)
*Schedule of the workshop*:
https://projects.iq.harvard.edu/applied.stats.workshop-gov3009
Looking forward to seeing you all on Wednesday!
Best,
Sooahn
Dear Applied Statistics Workshop Community,
Our next meeting of the semester will be at *12:10 pm (EST) Wednesday,
February 23*, where Soroush Saghafian
<https://scholar.harvard.edu/saghafian/home> (Harvard University) presents
"Ambiguous Dynamic Treatment Regimes: A Reinforcement Learning
Approach." The full paper is available here
<https://scholar.harvard.edu/saghafian/publications/ambiguous-dynamic-treatm…>.
*Abstract*
A main research goal in various studies is to use an observational data set
and provide a new set of counterfactual guidelines that can yield causal
improvements. Dynamic Treatment Regimes (DTRs) are widely studied to
formalize this process and enable researchers to find guidelines that are
both personalized and dynamic. However, available methods for finding
optimal DTRs often rely on assumptions that are violated in real-world
applications (e.g., medical decision-making or public policy), especially
when (a) the existence of unobserved confounders cannot be ignored, and (b)
the unobserved confounders are time-varying (e.g., affected by previous
actions). When such assumptions are violated, one often faces ambiguity
regarding the underlying causal model that must be assumed to
obtain an optimal DTR. This ambiguity is inevitable, since the dynamics of
unobserved confounders and their causal impact on the observed part of the
data cannot be understood from the observed data. Motivated by a case study
of finding superior treatment regimes for patients who underwent
transplantation in our partner hospital and faced a medical condition known
as New Onset Diabetes After Transplantation (NODAT), we propose a new
framework termed Ambiguous Dynamic Treatment Regimes (ADTRs), in which the
causal impact of treatment regimes is evaluated based on a “cloud” of
potential causal models. We then connect ADTRs to Ambiguous Partially
Observable Markov Decision Processes (APOMDPs) proposed by Saghafian (2018),
and consider unobserved confounders as latent variables but with ambiguous
dynamics and causal effects on observed variables. Using this connection,
we develop two Reinforcement Learning methods termed Direct Augmented
V-Learning (DAV-Learning) and Safe Augmented V-Learning (SAV-Learning),
which enable using the observed data to efficiently learn an optimal
treatment regime. We establish theoretical results for these learning
methods, including (weak) consistency and asymptotic normality. We further
evaluate the performance of these learning methods both in our case study
(using clinical data) and in simulation experiments (using synthetic data).
We find promising results for our proposed approaches, showing that they
perform well even compared to an imaginary oracle who knows both the true
causal model (of the data generating process) and the optimal regime under
that model.
*Where:* CGIS Knafel Building, Room K354
(See this link <https://map.harvard.edu/?bld=04471&level=9> for directions).
*When:* Wednesday, February 23 at 12:10 - 1:30 pm.
(Bagged lunches available for pick-up at CGIS K354 *11:30 - 11:45 am*, for
the participants who responded to our previous survey. The CGIS cafe on the
first floor has been designated as an eating area, and participants may
also use outdoor spaces for lunch. Please be present at K354 by 12:10 pm
for the presentations.)
*Zoom link*:
https://harvard.zoom.us/j/97004196610?pwd=eGFydkF5RDRjUlk5RVcyTjV6OStUQT09
(For the participants who cannot join the session physically.)
*Schedule of the workshop*:
https://projects.iq.harvard.edu/applied.stats.workshop-gov3009
Looking forward to seeing you all on Wednesday!
Best,
Sooahn
Dear Applied Statistics Workshop Community,
Our next meeting of the semester will be at *12:10 pm (EST) Wednesday,
February 16*, where Edward McFowland III
<https://www.hbs.edu/faculty/Pages/profile.aspx?facId=772797> (Harvard
University) presents "Anomalous Pattern Detection: A Novel Lens for
Scientific Inquiry."
*Abstract*
There has been a growing interest in the use of machine learning methods
for causal inference, which often involves adjusting or reappropriating
predictive models, with causality in mind. As an alternative, anomaly
detection methods offer a unique lens through which to conduct causal
inference, as the presence of a causal effect results in treatment group
units that appear anomalous in comparison to the control group. Moreover,
anomalous pattern detection intentionally localizes the presence of
treatment effects, which has tremendous value when the ultimate goal
involves hypothesis generation, understanding causal mechanisms, or
targeting subpopulations. As motivation, we will consider the
identification of subpopulations in randomized experiments with extremely
significant effects, and will consider other quasi-experimental settings as
time permits.
*Where:* CGIS Knafel Building, Room K354
(See this link <https://map.harvard.edu/?bld=04471&level=9> for directions).
*When:* Wednesday, February 16 at 12:10 - 1:30 pm.
(Bagged lunches available for pick-up at CGIS K354 *11:30 - 11:45 am*, for
the participants who responded to our previous survey. The CGIS cafe on the
first floor has been designated as an eating area, and participants may
also use outdoor spaces for lunch. Please be present at K354 by 12:10 pm
for the presentations.)
*Zoom link*:
https://harvard.zoom.us/j/97004196610?pwd=eGFydkF5RDRjUlk5RVcyTjV6OStUQT09
(For the participants who cannot join the session physically.)
*Schedule of the workshop*:
https://projects.iq.harvard.edu/applied.stats.workshop-gov3009
Looking forward to seeing you all on Wednesday!
Best,
Sooahn
Dear Applied Statistics Workshop Community,
Our next meeting of the semester will be at *12:10 pm (EST) Wednesday,
February 9*, where Tyler VanderWeele
<https://www.hsph.harvard.edu/tyler-vanderweele/> (Harvard University)
presents "The Global Flourishing Study - Seeking Analytic Input."
*Abstract*
The recently launched Global Flourishing Study
<https://hfh.fas.harvard.edu/files/pik/files/globalflourishingstudy_report.p…>
is a longitudinal research study being carried out in collaboration between
scholars at the Human Flourishing Program <https://hfh.fas.harvard.edu/> at
Harvard's Institute for Quantitative Social Science, Baylor’s Institute for
Studies of Religion, Gallup, and the Center for Open Science.
The study will involve data collection for approximately 240,000
participants, from 22 geographically and culturally diverse countries, with
nationally representative samples within each country, and with annual data
collection on the same panel of individuals for five waves of data. The
survey includes a rich set of questions on well-being along with
demographic, social, economic, political, religious, personality,
childhood, community, health and character-based questions. The data will
constitute an open-access resource available to scholars throughout the
world. However, in addition to what are hoped to be diverse and
wide-ranging uses of the data, the primary research team intends to carry
out a series of coordinated parallel pre-registered analyses. The talk will
give an overview of the Global Flourishing Study itself and the flourishing
framework that motivated it, along with current analysis plans for the
coordinated pre-registered studies, with the aim of receiving critique,
suggestions, and feedback from the Applied Statistics Workshop
participants. Open questions will be put forward concerning appropriate
meta-analytic summaries, confounder control with a large number of highly
correlated indicators, and challenges of missing data and attrition, all
while respecting complex survey weights, the limitations of existing
software, and the desire to allow the utilization of multiple software
packages given the size and diversity of the primary research team.
*Where:* CGIS Knafel Building, Room K354
(See this link <https://map.harvard.edu/?bld=04471&level=9> for directions).
*When:* Wednesday, February 9 at 12:10 - 1:30 pm.
(Bagged lunches available for pick-up at CGIS K354 *11:30 - 11:45 am*, for
the participants who responded to our previous survey. The CGIS cafe on the
first floor has been designated as an eating area, and participants may
also use outdoor spaces for lunch. Please be present at K354 by 12:10 pm
for the presentations.)
*Zoom link*:
https://harvard.zoom.us/j/97004196610?pwd=eGFydkF5RDRjUlk5RVcyTjV6OStUQT09
(For the participants who cannot join the session physically.)
*Schedule of the workshop*:
https://projects.iq.harvard.edu/applied.stats.workshop-gov3009
Looking forward to seeing you all on Wednesday!
Best,
Sooahn
Dear Applied Statistics Workshop Community,
Our next meeting of the semester will be at *12:10 pm (EST) Wednesday,
February 2*, where William La Cava <http://williamlacava.com> (Harvard
University) presents "Unfairness in AI-based Clinical Decisions:
Intersectional Approaches to Measurement and Mitigation."
*Abstract*
Clinical decision support systems increasingly rely on machine learning
(ML) models to recommend courses of action. As a result, these systems have
the potential to exacerbate inequities in healthcare allocation and
disadvantage historically and contemporarily marginalized groups. To
address this risk, fair ML algorithms have been proposed that minimize
differences in model performance among patient groups. I will discuss some
of these methods and the challenges to implementing them in practice. Two
major challenges are to measure and mitigate these differences when we
consider grouping patients by intersections of demographic variables such
as age, race, ethnicity, sex, and socio-economic status.
*Where:* CGIS Knafel Building, Room K354
(See this link <https://map.harvard.edu/?bld=04471&level=9> for directions).
*When:* Wednesday, February 2 at 12:10 - 1:30 pm.
(Bagged lunches available for pick-up at CGIS K354 *11:30 - 11:45 am*, for
the participants who responded to our previous survey. The CGIS cafe on the
first floor has been designated as an eating area, and participants may
also use outdoor spaces for lunch. Please be present at K354 by 12:10 pm
for the presentations.)
*Zoom link*:
https://harvard.zoom.us/j/97004196610?pwd=eGFydkF5RDRjUlk5RVcyTjV6OStUQT09
(For the participants who cannot join the session physically.)
*Schedule of the workshop*:
https://projects.iq.harvard.edu/applied.stats.workshop-gov3009
Looking forward to seeing you all on Wednesday!
Best,
Sooahn
Dear Applied Statistics Workshop Community,
Welcome back to another semester of the Applied Statistics Workshop. Our
first meeting of the Spring semester will be at *12:10 pm (EST) Wednesday,
January 26, via Zoom* (link
<https://harvard.zoom.us/j/97004196610?pwd=eGFydkF5RDRjUlk5RVcyTjV6OStUQT09>).
Please note that the first meeting will be *entirely on Zoom*, and from the
second meeting (Feb 2), we will continue to meet every Wednesday from 12:10
- 1:30 pm throughout the semester in CGIS Knafel, Room K354.
The schedule for the semester is below:
Jan 26 [Zoom only]: Tracy Ke (Harvard University)
Feb 2: William La Cava (Harvard University)
Feb 9: Tyler VanderWeele (Harvard University)
Feb 16: Edward McFowland III (Harvard University)
Feb 23: Soroush Saghafian (Harvard University)
Mar 2: Sharad Goel (Harvard University)
Mar 9: Iavor Bojinov (Harvard University)
Mar 16: No meeting (Spring recess)
Mar 23: José R. Zubizarreta (Harvard University)
Mar 30 [Zoom only]: Hannah Druckenmiller (Resources for the Future)
Apr 6: Adeline Lo (University of Wisconsin-Madison; Visiting faculty
affiliate at Harvard University)
Apr 13: Deirdre Bloome (Harvard University)
Apr 20: Soubhik Barari (Harvard University)
Apr 27 [Zoom only]: Tasha Fairfield (London School of Economics and
Political Science)
The title and abstract of the first presentation will be shared soon.
I look forward to seeing you all virtually on Wednesday!
Best,
Sooahn
Dear Applied Statistics Workshop Community,
Our next meeting of the semester will be at 12:10 pm (EST) Wednesday, December 1, where Susan Murphy <http://people.seas.harvard.edu/~samurphy/> (Harvard University) presents "Assessing Personalization in Digital Health."
Abstract
Reinforcement learning provides an attractive suite of online learning methods for personalizing interventions in digital health. However, after a reinforcement learning algorithm has been run in a clinical study, how do we assess whether personalization occurred? We might find users for whom it appears that the algorithm has indeed learned in which contexts the user is more responsive to a particular intervention. But could this have happened completely by chance? We discuss some first approaches to addressing these questions.
Where: CGIS Knafel Building, Room K354
(See this link <https://map.harvard.edu/?bld=04471&level=9> for directions).
When: Wednesday, December 1 at 12:10 - 1:30 pm.
(Bagged lunches available for pick-up at CGIS K354 11:30 - 11:45 am, for the participants who responded to our previous survey. The CGIS cafe on the first floor has been designated as an eating area, and participants may also use outdoor spaces for lunch. Please be present at K354 by 12:10 pm for the presentations.)
Zoom link: https://harvard.zoom.us/j/97004196610?pwd=eGFydkF5RDRjUlk5RVcyTjV6OStUQT09
(For the participants who cannot join the session physically.)
Schedule of the workshop: https://projects.iq.harvard.edu/applied.stats.workshop-gov3009
Looking forward to seeing you all on Wednesday!
Best,
Sooahn