eScholarship
Open Access Publications from the University of California

UC Berkeley

UC Berkeley Electronic Theses and Dissertations

Super Learner and Targeted Maximum Likelihood Estimation for Longitudinal Data Structures with Applications to Atrial Fibrillation

Abstract

This thesis discusses the Super Learner and Targeted Maximum Likelihood Estimation (TMLE) for longitudinal data structures in nonparametric statistical models. It focuses specifically on time-dependent data structures where the outcome of interest may be described as a counting process. A Super Learner for the conditional intensity of the counting process is proposed, based on the minimization of squared error and negative Bernoulli log-likelihood risks. An analytic comparison of the oracle inequalities for the cross-validation selector under the squared error and negative Bernoulli log-likelihood risks is provided. TMLE is extended to enforce a calibration property, defined as an implicit constraint, on the Super Learner. Tradeoffs between calibration and risk minimization are explored through simulation. The final chapter discusses and implements a recently developed TMLE for the intervention-specific marginal mean in general longitudinal data structures. A modification of this general TMLE algorithm template is implemented to respect model constraints in the estimation of the cumulative event probability in survival data with time-dependent covariates and informative right-censoring. Each chapter is supplemented with practical examples of these estimators using data from the Kaiser Permanente ATRIA-1 cohort study of adults with atrial fibrillation. The primary appendix presents a new Super Learner software implementation as a SAS (Statistical Analysis System) macro for data-adaptive machine learning. SAS macro code for the TMLEs is provided in secondary appendices.
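
As a rough illustration of the cross-validation selector mentioned in the abstract, the sketch below picks between two toy candidate estimators of a binary event probability by minimizing the cross-validated squared error risk and the negative Bernoulli log-likelihood risk. It is written in Python rather than SAS and is not the thesis's implementation. The candidate estimators, the simulated data, the fold scheme, the truncation constant `eps`, and every function name are assumptions made for this example; each row is treated as one record with a binary covariate `x` and an event indicator `y`.

```python
import math
import random


def squared_error(y, p):
    """Squared error loss for a binary outcome y and predicted probability p."""
    return (y - p) ** 2


def neg_bernoulli_loglik(y, p, eps=1e-6):
    """Negative Bernoulli log-likelihood loss; p is truncated away from 0 and 1
    (the truncation level eps is an assumption of this sketch)."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def fit_marginal(rows):
    """Hypothetical candidate 1: ignore x and predict the overall event rate."""
    rate = sum(y for _, y in rows) / len(rows)
    return lambda x: rate


def fit_stratified(rows):
    """Hypothetical candidate 2: predict the event rate within each level of x."""
    overall = sum(y for _, y in rows) / len(rows)
    rates = {}
    for level in {x for x, _ in rows}:
        ys = [y for x, y in rows if x == level]
        rates[level] = sum(ys) / len(ys)
    # Fall back to the overall rate for a level unseen in the training folds.
    return lambda x: rates.get(x, overall)


def cv_risks(rows, candidates, loss, n_folds=5):
    """V-fold cross-validated risk of each candidate under the given loss."""
    folds = [rows[i::n_folds] for i in range(n_folds)]
    risks = {}
    for name, fit in candidates.items():
        total = 0.0
        for v in range(n_folds):
            train = [r for i, f in enumerate(folds) if i != v for r in f]
            predict = fit(train)
            total += sum(loss(y, predict(x)) for x, y in folds[v])
        risks[name] = total / len(rows)
    return risks


def cv_select(risks):
    """Cross-validation selector: the candidate with the smallest CV risk."""
    return min(risks, key=risks.get)


def simulate(n=2000, seed=1):
    """Simulated data (an assumption): event probability 0.1 if x=0, 0.6 if x=1."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.randint(0, 1)
        p = 0.1 if x == 0 else 0.6
        rows.append((x, 1 if rng.random() < p else 0))
    return rows


rows = simulate()
candidates = {"marginal": fit_marginal, "stratified": fit_stratified}
for loss in (squared_error, neg_bernoulli_loglik):
    risks = cv_risks(rows, candidates, loss)
    print(loss.__name__, "selects", cv_select(risks))
```

This sketch only selects a single candidate (a discrete selector). A full Super Learner would typically also consider weighted combinations of the candidates, which is omitted here.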
