Skip to main content
eScholarship
Open Access Publications from the University of California

UC San Diego

UC San Diego Previously Published Works bannerUC San Diego

A predictive ensemble classifier for the gene expression diagnosis of ASD at ages 1 to 4 years.

Abstract

Autism Spectrum Disorder (ASD) diagnosis remains behavior-based and the median age of diagnosis is ~52 months, nearly 5 years after its first-trimester origin. Accurate and clinically-translatable early-age diagnostics do not exist due to ASD genetic and clinical heterogeneity. Here we collected clinical, diagnostic, and leukocyte RNA data from 240 ASD and typically developing (TD) toddlers (175 toddlers for training and 65 for test). To identify gene expression ASD diagnostic classifiers, we developed 42,840 models composed of 3570 gene expression feature selection sets and 12 classification methods. We found that 742 models had AUC-ROC ≥ 0.8 on both Training and Test sets. Weighted Bayesian model averaging of these 742 models yielded an ensemble classifier model with accurate performance in Training and Test gene expression datasets with ASD diagnostic classification AUC-ROC scores of 85-89% and AUC-PR scores of 84-92%. ASD toddlers with ensemble scores above and below the overall ASD ensemble mean of 0.723 (on a scale of 0 to 1) had similar diagnostic and psychometric scores, but those below this ASD ensemble mean had more prenatal risk events than TD toddlers. Ensemble model feature genes were involved in cell cycle, inflammation/immune response, transcriptional gene regulation, cytokine response, and PI3K-AKT, RAS and Wnt signaling pathways. We additionally collected targeted DNA sequencing smMIPs data on a subset of ASD risk genes from 217 of the 240 ASD and TD toddlers. This DNA sequencing found about the same percentage of SFARI Level 1 and 2 ASD risk gene mutations in TD (12 of 105) as in ASD (13 of 112) toddlers, and classification based only on the presence of mutation in these risk genes performed at a chance level of 49%. By contrast, the leukocyte ensemble gene expression classifier correctly diagnostically classified 88% of TD and ASD toddlers with ASD risk gene mutations. Our ensemble ASD gene expression classifier is diagnostically predictive and replicable across different toddler ages, races, and ethnicities; out-performs a risk gene mutation classifier; and has potential for clinical translation.

Many UC-authored scholarly publications are freely available on this site because of the UC's open access policies. Let us know how this access is important for you.

Main Content
For improved accessibility of PDF content, download the file to your device.
Current View