eScholarship
Open Access Publications from the University of California

UC Irvine

UC Irvine Electronic Theses and Dissertations

Characterize and classify genetic variation in chromatin state in Drosophila melanogaster

Abstract

Genetic traits fall into two broad types. The first is monogenic traits, which are caused by rare variants that disrupt the function of a single gene. Monogenic traits typically follow classic Mendelian inheritance and are rare in nature. The second type comprises heritable traits that do not follow classic Mendelian inheritance. These are classified as complex traits and are thought to involve multiple genes. Many studies have devoted great effort to elucidating the nature of complex traits. However, an appreciable fraction of heritable variation remains unexplained and is referred to as "missing heritability". It is widely believed that this missing heritable variation reflects variation in gene expression due to the binding of transcription factors to enhancers. These binding events can be identified from the local chromatin configuration, which should be open in the particular tissue or at the particular timepoint relevant to a trait. Therefore, I argue that a genome-wide landscape of variation in chromatin accessibility across a large number of tissues would be valuable for complex trait studies.

Thus, my first chapter uses ATAC-seq to assess chromatin accessibility across multiple genotypes and tissues of Drosophila melanogaster. I performed ATAC-seq on four tissues: adult female brain, ovaries, and wing and eye-antennal imaginal discs. Each tissue was collected from eight different inbred strains. I identified 44099 ATAC-seq peaks, that is, regions with high ATAC-seq fragment coverage. Furthermore, because the eight inbred founder strains have reference-quality genome assemblies, I also performed structural variant (SV) correction on the ATAC-seq data. These structural variants contributed to an elevated rate (55%) of false positive differences in chromatin state between genotypes. After structural variant correction, I found 1050, 30383, and 4508 regions whose peak heights are polymorphic among genotypes, among tissues, or for genotype-by-tissue interactions, respectively. Finally, I identified 249 SNPs and 3 SVs as candidate causative variants that explained 100% of the variation at nearby chromatin profiles varying among genotypes.

While a completely characterized open chromatin landscape is helpful for the complex trait community, the question of whether those polymorphic regulatory elements act in cis or in trans remains unanswered. Thus, my second chapter aims to elucidate the cis and trans nature of the regulatory elements identified in the first study. I performed ATAC-seq and applied our quantile normalization of ATAC-seq data, SV correction, ANOVA-based statistical analysis, and haplotype phasing to examine chromatin accessibility and its cis and trans nature in Drosophila melanogaster ovaries collected from two parental strains (A4, B6) and their F1 offspring. We identified 3006 ATAC-seq peaks that differ significantly between the parental genotypes. Of those peaks, 106 were identified as cis regulatory and 45 as trans regulatory using the cis-trans value.
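The abstract names quantile normalization of ATAC-seq data but does not describe the procedure. As a rough illustration only, the following is a minimal sketch of generic textbook quantile normalization, not the thesis's own implementation. The function name `quantile_normalize`, the tie-breaking rule, and the coverage values are all hypothetical and chosen for illustration.

```python
def quantile_normalize(columns):
    """Quantile-normalize samples so that they share one value distribution.

    `columns` is a list of samples; each sample is a list of coverage values
    at the same set of peaks. Ties are broken by original order, which is a
    simplification (an assumption of this sketch).
    """
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all samples must cover the same set of peaks")

    # For each rank k, average the k-th smallest value across all samples.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [sum(col[k] for col in sorted_cols) / len(columns) for k in range(n)]

    normalized = []
    for col in columns:
        # Peak indices ordered from lowest to highest coverage in this sample.
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            # Replace each value with the cross-sample mean for its rank.
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Hypothetical coverage at four peaks in three samples (not data from the thesis).
samples = [
    [5.0, 2.0, 3.0, 4.0],
    [40.0, 10.0, 20.0, 30.0],
    [3.0, 4.0, 6.0, 8.0],
]
normalized = quantile_normalize(samples)
```

After this step every sample contains the same set of values, so differences in overall coverage between samples no longer drive differences in peak height, while the ranking of peaks within each sample is preserved.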
