Hi everyone!

This week at the Applied Statistics Workshop we will be welcoming Rebecca Betensky, Professor of Biostatistics at the Harvard School of Public Health. She will be presenting work entitled “Nonidentifiability in the presence of factorization for truncated data.” Please find the abstract below and on the Applied Stats website.

As usual, we will meet at noon in CGIS Knafel Room 354 and lunch will be provided. See you all there!

-- Dana Higgins

Title: Nonidentifiability in the presence of factorization for truncated data

Abstract: Truncation is a structured form of selection bias that arises often in cohort studies. A time to event, X, is left truncated by T if X can be observed only if T < X. This often results in oversampling of large values of X and necessitates adjustment of estimation procedures to avoid bias. Simple risk-set adjustments can be made to standard risk-set based estimators to accommodate left truncation as long as T and X are “quasi-independent,” i.e., independent in the observable region. Through examination of the likelihood function, we derive a weaker factorization condition for the conditional distribution of T given X in the observable region that likewise permits risk-set adjustment for estimation of the distribution of X (but not T). Quasi-independence results when the analogous factorization condition for X given T holds as well, in which case the distributions of both X and T are easily estimated. While we can test for factorization, if the test does not reject, we cannot identify which factorization condition holds, or whether both (i.e., quasi-independence) hold. Importantly, this means that we must ultimately make an unidentifiable assumption in order to estimate the distribution of X based on truncated data. This contrasts with the common understanding that truncation is distinct from censoring in that it does not require any unidentifiable assumptions. We illustrate these concepts through examples and a simulation study.
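
For anyone who would like a feel for left truncation before the talk, here is a minimal simulation sketch. It is not taken from the speaker's work. It assumes X = 0.5 + Exponential(1) and T ~ Uniform(0, 2), drawn independently (so quasi-independence holds by construction), with no censoring and no tied event times. The function names simulate_left_truncated, naive_survival, and risk_set_survival are hypothetical, chosen only for this illustration. The sketch compares the naive empirical survival of the observed X with a risk-set adjusted product-limit estimate, in which a subject counts as at risk at time x only if T <= x <= X.

```python
import bisect
import math
import random


def simulate_left_truncated(n, seed=0):
    """Draw n independent (T, X) pairs and keep only those with T < X."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = 0.5 + rng.expovariate(1.0)  # assumed event-time distribution
        t = rng.uniform(0.0, 2.0)       # assumed truncation-time distribution
        if t < x:                       # left truncation: observed only if T < X
            pairs.append((t, x))
    return pairs


def naive_survival(pairs, x0):
    """Fraction of observed X exceeding x0, ignoring truncation."""
    return sum(1 for _, x in pairs if x > x0) / len(pairs)


def risk_set_survival(pairs, x0):
    """Product-limit estimate of P(X > x0) with left-truncation risk sets.

    Assumes no censoring and no ties, so each event time contributes
    a factor (1 - 1 / at_risk).
    """
    ts = sorted(t for t, _ in pairs)
    xs = sorted(x for _, x in pairs)
    surv = 1.0
    for x in xs:
        if x > x0:
            break
        # Since T_j < X_j for every observed pair,
        # #{j: T_j <= x <= X_j} = #{j: T_j <= x} - #{j: X_j < x}.
        at_risk = bisect.bisect_right(ts, x) - bisect.bisect_left(xs, x)
        surv *= 1.0 - 1.0 / at_risk
    return surv


pairs = simulate_left_truncated(20000)
x0 = 1.5
true_surv = math.exp(-(x0 - 0.5))  # P(X > 1.5) under the assumed model
naive = naive_survival(pairs, x0)
adjusted = risk_set_survival(pairs, x0)
print(f"true {true_surv:.3f}  naive {naive:.3f}  risk-set adjusted {adjusted:.3f}")
```

Under these assumptions the naive estimate should sit well above the true survival probability, because large values of X are oversampled, while the adjusted estimate should land close to it. The adjustment is only justified here because T and X were simulated as independent.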