Dear Applied Statistics Workshop Community,
Our next meeting will be on October 4 (12:00 EST). Michael Lingzhi Li
presents "Statistical Performance Guarantee for Selecting Those Predicted
to Benefit Most from Treatment."
<When>
October 4, 12:00 to 1:30 PM, EST
Lunch will be available for pick-up inside CGIS K354.
<Where>
In-person: CGIS K354
Zoom:
https://harvard.zoom.us/j/93217566507?pwd=elBwYjRJcWhlVE5teE1VNDZoUXdjQT09
<Abstract>
Across a wide array of disciplines, many researchers use modern machine
learning algorithms to identify a subgroup of individuals, called
exceptional responders, who are likely to benefit most from a treatment. A
common approach is to first estimate the conditional average
treatment effect (CATE) or its proxy given a set of pre-treatment
covariates and then optimize a cutoff of the resulting treatment
prioritization score to prioritize who should receive the treatment.
Unfortunately, since these estimated scores are often biased and noisy in
practice, naive reliance on them can lead to misleading inference.
Furthermore, practitioners often utilize the same set of data to optimize
the cutoff and evaluate the performance of the resulting subset, causing a
multiple testing problem. In this paper, we propose a methodology that has
a uniform statistical performance guarantee for selecting such exceptional
responders regardless of the cutoff optimization. Specifically, we develop
a uniform confidence interval for experimentally evaluating the group
average treatment effect (GATE) among the individuals whose estimated score
is at least as high as any given quantile value. This uniform confidence
interval enables researchers to use arbitrary methods to choose the
quantile of the estimated score, including optimizing over the lower confidence
bound of the estimated GATE among the selected individuals. The proposed
methodology provides this statistical performance guarantee without
suffering from multiple testing problems, and also generalizes to a generic
class of statistics beyond GATE. Importantly, the validity of our
methodology depends solely on randomization of treatment and random
sampling of units and does not require modeling assumptions or resampling
methods. Consequently, our methodology is applicable to any machine
learning algorithm and is computationally efficient.
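For readers unfamiliar with the setup, the common practice the abstract describes can be sketched on synthetic data. This is only an illustration, not the paper's method: the data, the stand-in score, and the difference-in-means GATE estimate are all made up for the example. Note that scanning many cutoffs on the same data, as below, is precisely what creates the multiple-testing problem the proposed uniform confidence interval is designed to handle.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic randomized experiment (illustrative only):
n = 1000
score = rng.normal(size=n)           # stand-in for an estimated CATE / prioritization score
treat = rng.integers(0, 2, size=n)   # randomized binary treatment assignment
# Outcome whose treatment effect grows with the score
y = 0.5 * treat * score + rng.normal(size=n)

def gate_estimate(score, treat, y, q):
    """Difference-in-means GATE estimate among units whose estimated
    score is at or above the q-th quantile of the scores."""
    sel = score >= np.quantile(score, q)
    return y[sel & (treat == 1)].mean() - y[sel & (treat == 0)].mean()

# Evaluating many candidate cutoffs on the same data:
quantiles = np.linspace(0.1, 0.9, 9)
estimates = [gate_estimate(score, treat, y, q) for q in quantiles]
```

Picking the cutoff that maximizes such an estimate and then reporting that same estimate overstates the effect; the talk's uniform confidence interval is valid simultaneously over all quantile cutoffs, so the optimized choice retains its coverage guarantee.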
<2023 Schedule>
GOV 3009 Website:
https://projects.iq.harvard.edu/applied.stats.workshop-gov3009
Calendar:
https://calendar.google.com/calendar/u/0?cid=Y18zdjkzcGF2OWZqa2tsZHJidTlzbm…
Best,
Jialu
--
Jialu Li
Department of Government
Harvard University
jialu_li@g.harvard.edu