Dear Applied Statistics Workshop Community,

Our next meeting of the semester will be at 12:10 pm (EST) Wednesday, February 23, where Soroush Saghafian (Harvard University) presents "Ambiguous Dynamic Treatment Regimes: A Reinforcement Learning Approach." The full paper is available here.

Abstract

A main research goal in various studies is to use an observational data set and provide a new set of counterfactual guidelines that can yield causal improvements. Dynamic Treatment Regimes (DTRs) are widely studied to formalize this process and enable researchers to find guidelines that are both personalized and dynamic. However, available methods for finding optimal DTRs often rely on assumptions that are violated in real-world applications (e.g., medical decision-making or public policy), especially when (a) the existence of unobserved confounders cannot be ignored, and (b) the unobserved confounders are time-varying (e.g., affected by previous actions). When such assumptions are violated, one often faces ambiguity regarding the underlying causal model that must be assumed to obtain an optimal DTR. This ambiguity is inevitable, since the dynamics of unobserved confounders and their causal impact on the observed part of the data cannot be understood from the observed data. Motivated by a case study of finding superior treatment regimes for patients who underwent transplantation in our partner hospital and faced a medical condition known as New Onset Diabetes After Transplantation (NODAT), we propose a new framework termed Ambiguous Dynamic Treatment Regimes (ADTRs), in which the causal impact of treatment regimes is evaluated based on a “cloud” of potential causal models. We then connect ADTRs to Ambiguous Partially Observable Markov Decision Processes (APOMDPs) proposed by Saghafian (2018), and consider unobserved confounders as latent variables but with ambiguous dynamics and causal effects on observed variables. Using this connection, we develop two Reinforcement Learning methods termed Direct Augmented V-Learning (DAV-Learning) and Safe Augmented V-Learning (SAV-Learning), which enable using the observed data to efficiently learn an optimal treatment regime.
We establish theoretical results for these learning methods, including (weak) consistency and asymptotic normality. We further evaluate the performance of these learning methods both in our case study (using clinical data) and in simulation experiments (using synthetic data). We find promising results for our proposed approaches, showing that they perform well even compared to an imaginary oracle who knows both the true causal model (of the data generating process) and the optimal regime under that model.


Where: CGIS Knafel Building, Room K354
(See this link for directions).

When: Wednesday, February 23 at 12:10 - 1:30 pm.
(Bagged lunches available for pick-up at CGIS K354 11:30 - 11:45 am, for the participants who responded to our previous survey. The CGIS cafe on the first floor has been designated as an eating area, and participants may also use outdoor spaces for lunch. Please be present at K354 by 12:10 pm for the presentations.)

Zoom link: https://harvard.zoom.us/j/97004196610?pwd=eGFydkF5RDRjUlk5RVcyTjV6OStUQT09
(For participants who cannot attend the session in person.)

Schedule of the workshop: https://projects.iq.harvard.edu/applied.stats.workshop-gov3009

Looking forward to seeing you all on Wednesday!

Best,
Sooahn