eScholarship
Open Access Publications from the University of California

UCLA

UCLA Electronic Theses and Dissertations

Prediction and Inference for High-Dimensional Genetic Data

Abstract

The collection of large amounts of genetic data and advances in computational genetics in recent years provide tools to explore the epigenetic mechanisms underlying aging and lifespan. In the context of continuous DNA methylation data, a novel cross-species DNA methylation microarray targeting CpG sites conserved across mammalian species allows us to apply readily available statistical models to study important life history traits, such as lifespan, gestation time, and time to sexual maturity, across various species. DNA methylation data are often high dimensional and require regularized regression frameworks to construct practical prediction models. Based on an unprecedented mammalian DNA methylation data set, we have developed methylation-based epigenetic predictors of life history traits using regularized linear regression. The estimators can accurately predict maximum lifespan from cytosine methylation patterns collected from over 13,000 samples derived from 348 mammalian species. To extend future inferential analyses to diverse data sources such as RNA-seq data, we have proposed an L0-regularized Poisson graphical model for exploring gene-to-gene relations. The superior theoretical properties of the L0 sparse graphical model will more effectively support future work on clustering and grouping large numbers of DNA methylation sites and genes. Both the applied research and the methodological work will aid the aging-research goal of integrating various layers of multiomics data.
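
To illustrate why high-dimensional data call for regularization, the following is a minimal sketch of an L1-penalized (lasso) linear regression fitted by cyclic coordinate descent on synthetic data with more features than samples. It is an assumption-laden illustration, not the thesis's method: the abstract does not state which penalty was used, the data here are simulated rather than methylation measurements, and all names (`lasso_cd`, `objective`, `lam`, and so on) are hypothetical.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def objective(X, y, beta, lam):
    """Lasso objective: (1/(2n)) * ||y - X beta||^2 + lam * ||beta||_1."""
    n = len(X)
    rss = 0.0
    for i in range(n):
        pred = sum(X[i][j] * beta[j] for j in range(len(beta)))
        rss += (y[i] - pred) ** 2
    return rss / (2 * n) + lam * sum(abs(b) for b in beta)


def lasso_cd(X, y, lam, n_iter=100):
    """Minimise the lasso objective by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, with beta starting at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the partial residual.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


# Synthetic "p >> n" setting: 30 samples, 100 features, 3 truly active features.
random.seed(0)
n, p = 30, 100
true_beta = [3.0, -2.0, 1.5] + [0.0] * (p - 3)
X = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [sum(X[i][j] * true_beta[j] for j in range(p)) + random.gauss(0.0, 0.1)
     for i in range(n)]

lam = 0.3  # penalty strength (an arbitrary choice for this sketch)
beta_hat = lasso_cd(X, y, lam)
selected = [j for j, b in enumerate(beta_hat) if b != 0.0]
print("number of selected features:", len(selected))
```

With more features than samples, ordinary least squares has no unique solution. The penalty both stabilizes the fit and sets many coefficients exactly to zero. In practice the penalty strength would be chosen by cross-validation rather than fixed by hand.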
