eScholarship
Open Access Publications from the University of California

UCLA Electronic Theses and Dissertations

Scalable Inference in Bayesian Phylogenetics

Abstract

Phylogenetic models with lineage-specific parameter characterizations provide a flexible framework to model ancestral changes in diffusion and evolution processes. However, increased taxonomic sampling challenges inference under these models because the number of unknown parameters grows with the number of taxa. To address this problem, I develop scalable inference machinery as well as scalable models to permit the study of increasingly massive trees within a Bayesian phylogenetic framework. First, I introduce a method to compute the gradient of the trait data log-likelihood under the popular relaxed random walk model of trait diffusion with computational complexity that is linear in the number of tips in the tree. I use this gradient to build an efficient Hamiltonian Monte Carlo (HMC) sampler that simultaneously samples all branch-specific model parameters with high acceptance probability. Next, I propose a new, auto-correlated molecular clock rate model together with scalable inference methods. My approach permits estimating both the presence and location of local clocks without a priori knowledge of their placement and avoids inordinately shrinking clock rates. Finally, I develop a shrinkage-based adaptive shift model that automatically detects the number and placement of shifts in adaptive trait optima along a tree. Leveraging recent fast closed-form gradient calculations, I build an efficient HMC sampler that scales inference under this new model. I demonstrate the speed and utility of each method via a range of applications, including the study of viral evolution and phenotypic trait data.
