eScholarship
Open Access Publications from the University of California

UCLA

UCLA Electronic Theses and Dissertations

Trees vs Neurons: Comparison between Denoising Autoencoders and Random Forest for Imputation of Mixed Data from Electronic Medical Records

Abstract

Missing data is a significant challenge in almost all studies, and especially in analyses of electronic health records (EHR). We propose a multiple imputation model based on multi-layer denoising autoencoders. This nonparametric model handles mixed data types and makes no assumptions about the missingness mechanism. Evaluation on simulated datasets based on real-life EHR datasets showed that the proposed model outperforms the current Random Forest method and median/mode imputation.
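
For orientation, the simplest of the compared methods, median/mode imputation, can be sketched in a few lines. This is only an illustrative sketch and not the thesis's code: the function name `impute_median_mode`, the record layout (a list of dicts with `None` marking a missing value), and the toy `age`/`sex` columns are all assumptions made for the example.

```python
from statistics import median, mode


def impute_median_mode(rows, numeric_cols):
    """Fill None entries: median for numeric columns, mode for all others.

    Hypothetical helper for illustration; `rows` is a list of dicts that
    share the same keys, and `numeric_cols` names the numeric columns.
    """
    columns = list(rows[0].keys())
    fill = {}
    for col in columns:
        # Fill values are computed from the observed (non-missing) entries only.
        observed = [r[col] for r in rows if r[col] is not None]
        if not observed:
            fill[col] = None  # nothing observed, so the column stays missing
        elif col in numeric_cols:
            fill[col] = median(observed)
        else:
            fill[col] = mode(observed)
    return [
        {col: (fill[col] if r[col] is None else r[col]) for col in columns}
        for r in rows
    ]


# Toy mixed-type records (made up for the example, not EHR data).
records = [
    {"age": 61, "sex": "F"},
    {"age": None, "sex": "M"},
    {"age": 47, "sex": None},
    {"age": 70, "sex": "F"},
]
imputed = impute_median_mode(records, numeric_cols={"age"})
```

Every missing entry in a column receives the same value under this baseline, so it ignores relationships between variables. That limitation is what model-based approaches such as Random Forest or autoencoder imputation aim to address.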
