eScholarship
Open Access Publications from the University of California

UC San Diego

UC San Diego Electronic Theses and Dissertations

Comparative Pangenomics: Finding structure in the endless diversity of microbial life

Abstract

Rapid developments in genome sequencing technology have revealed tremendous genetic diversity in the bacterial kingdom. The full genetic repertoire, or pangenome, of any microbial species has been found to be far more expansive than that of any individual organism, and understanding this complexity promises insights into both fundamental and practical questions regarding the diversity of microbial life. This dissertation aims to systematize the analysis of pangenomes to enable comparisons between multiple pangenomes and build generalizable workflows for investigating complex biological phenomena not limited to individual species. First, a robust pipeline for pangenome construction is presented as the foundation of this work and used to identify constants in intraspecies genetic diversity across twelve microbial species. Second, pangenome construction is combined with machine learning to elucidate global patterns of antimicrobial resistance (AMR) and identify novel AMR-conferring gene candidates. Finally, pangenome analysis is scaled to the limits of publicly available data to construct and compare the core genomes of 183 species, and integrated with phylogenetics to reconstruct the core genome of the last bacterial common ancestor and identify implications regarding the minimum genetic requirements for life. These results demonstrate how pangenomes can reveal novel facets of genetic diversity previously invisible at smaller scales, and continued development of pangenomics will be necessary to close the gap between the pace of data collection and the pace of discovery.
