eScholarship
Open Access Publications from the University of California

UC Berkeley

UC Berkeley Electronic Theses and Dissertations

Probabilistic Methods for the Inference of Selection and Demography from Ancient Human Genomes

Abstract

Recently developed technologies for the recovery and sequencing of ancient DNA have generated an explosion of paleogenomic data in the last five years. In particular, human paleogenomics has become a thriving field for understanding evolutionary patterns of different hominin groups over time. However, there is still a dearth of statistical tools that allow biologists to discern meaningful patterns from ancient genomes. Here, I present three methods designed for inferring past demographic processes and detecting loci under selection using ancient and modern hominin genomes. First, I describe an algorithm to co-estimate the contamination rate, the sequencing error rate and demographic parameters (including drift times and admixture rates) for an ancient nuclear genome obtained from human remains, when the putative contaminating DNA comes from present-day humans. The method is implemented in a C++ program called "Demographic Inference with Contamination and Error" (DICE). Then, I present two methods for downstream analyses of paleogenomic samples, each tailored for detecting a different type of positive selection. The first consists of a series of summary statistics for detecting adaptive introgression (AI). In particular, the number and allele frequencies of sites that are uniquely shared between archaic humans and specific present-day populations are useful for detecting adaptive pressures on introgressed haplotypes. The second is a composite likelihood ratio method called "3P-CLR", aimed at locating regions of the genome that were subject to selection before two populations split from each other. I use this method to look for regions under positive selection in the ancestral modern human population after its split from Neanderthals.
I validate all of the above methods using simulations and real data, including present-day human genomes from the 1000 Genomes Project and several high- and low-coverage ancient genomes from archaic and early modern humans. I also recover potentially interesting candidate loci that may have been important for various phenotypic adaptations during recent human evolution.
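
To make the idea of "uniquely shared" sites concrete, the following is a minimal Python sketch, not the implementation used in the dissertation. It assumes per-site derived allele frequencies are already available for an archaic genome, a target present-day panel and a non-introgressed comparison panel. The function name `uniquely_shared_sites` and the cutoffs `max_outgroup` and `min_target` are hypothetical choices made for illustration only.

```python
from typing import List, Sequence


def uniquely_shared_sites(
    outgroup_freq: Sequence[float],
    target_freq: Sequence[float],
    archaic_freq: Sequence[float],
    max_outgroup: float = 0.01,  # assumed cutoff, not taken from the source
    min_target: float = 0.2,     # assumed cutoff, not taken from the source
) -> List[float]:
    """Return target-panel derived allele frequencies at uniquely shared sites.

    A site counts as uniquely shared when the archaic genome is fixed for the
    derived allele, the allele is rare in the comparison panel, and it is
    common in the target panel.
    """
    if not (len(outgroup_freq) == len(target_freq) == len(archaic_freq)):
        raise ValueError("all frequency vectors must have the same length")

    shared = []
    for f_out, f_tgt, f_arc in zip(outgroup_freq, target_freq, archaic_freq):
        if f_arc == 1.0 and f_out < max_outgroup and f_tgt > min_target:
            shared.append(f_tgt)
    return shared


# Toy data for five sites in one genomic window (illustrative values only).
outgroup = [0.0, 0.0, 0.3, 0.005, 0.0]
target = [0.5, 0.1, 0.6, 0.25, 0.9]
archaic = [1.0, 1.0, 1.0, 1.0, 0.0]

shared = uniquely_shared_sites(outgroup, target, archaic)
n_shared = len(shared)                      # number of uniquely shared sites
max_shared_freq = max(shared, default=0.0)  # highest target frequency among them
```

In a genome scan, both summaries would be computed per window, and windows with unusually many high-frequency uniquely shared sites would be flagged as candidates for adaptive introgression.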
