eScholarship
Open Access Publications from the University of California

UC Berkeley Electronic Theses and Dissertations

Learning Beyond the Standard Model (of Data)

Abstract

Classically, most machine learning (ML) methodology has rested on an innocuous modeling assumption: the training and test data are each sampled independently from a pair of identical, well-behaved distributions. Yet, in the situations modern ML methods must confront, deviations from this idealized setting are quickly becoming the norm, not the exception. In this thesis, we address the challenges of understanding the often unexpected phenomenology of these settings by developing theory in two areas of interest: transfer learning and robust learning. In particular, we focus on identifying the structural conditions and techniques needed to permit sample-efficient learning in these new settings, in order to answer questions such as why pretraining is so effective and what the limits of learning are for extremely heavy-tailed distributions.
