eScholarship
Open Access Publications from the University of California

UC Berkeley Electronic Theses and Dissertations

From Sound to Meaning: Representations of Speech in Human Cortex

Abstract

This dissertation investigates how speech is represented in human cortex during perception, using a combination of functional magnetic resonance imaging (fMRI) and psychoacoustical experiments.

Previous research has shown that low-level acoustical structure, phonemes, and words are processed by distinct cortical areas. However, little is known about the relationship between these different representations. To address this problem, we simultaneously mapped many different representations of speech. We recorded fMRI responses from subjects listening to over two hours of natural speech. We then examined three feature spaces representing the speech sounds in terms of auditory, articulatory, and semantic features. We used voxel-wise modeling for each feature space, combined with a novel variance-partitioning method, to assess how much response variance could be explained uniquely by each model or jointly by two or three models. Validating our approach, we found that a quarter of the brain was significantly responsive to the stories, and that our models could account for up to 45% of the explainable variance in cortex and over 60% of the explainable variance in auditory areas. We also found a hierarchical set of processing steps, starting in primary auditory areas and moving along the posteroventral region of the temporal lobe, that are involved in the sound-to-word-meaning transformation.
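
The idea of splitting explained variance into unique and shared parts across three feature spaces can be illustrated with a small sketch. Everything here is an assumption for illustration only: the abstract does not describe the estimator or the exact partitioning procedure, so the sketch uses in-sample ordinary least squares on a single response vector and set-style inclusion-exclusion over the R² values of all seven model combinations. The function names `r2` and `partition_three` are hypothetical.

```python
import numpy as np

def r2(y, *feature_blocks):
    """In-sample R^2 of an ordinary least-squares fit of y on the given
    feature blocks (concatenated column-wise, with an intercept)."""
    X = np.column_stack((np.ones(len(y)),) + feature_blocks)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residual = y - X @ coef
    return 1.0 - residual.var() / y.var()

def partition_three(y, a, b, c):
    """Split the R^2 of the joint model (a + b + c) into seven regions:
    three unique, three shared by exactly two spaces, one shared by all.
    Illustrative assumption: inclusion-exclusion over joint-model R^2."""
    ra, rb, rc = r2(y, a), r2(y, b), r2(y, c)
    rab, rac, rbc = r2(y, a, b), r2(y, a, c), r2(y, b, c)
    rabc = r2(y, a, b, c)
    return {
        "unique_a": rabc - rbc,
        "unique_b": rabc - rac,
        "unique_c": rabc - rab,
        "shared_ab": rac + rbc - rc - rabc,
        "shared_ac": rab + rbc - rb - rabc,
        "shared_bc": rab + rac - ra - rabc,
        "shared_abc": ra + rb + rc - rab - rac - rbc + rabc,
        "total": rabc,
    }

if __name__ == "__main__":
    # Synthetic stand-ins for three feature spaces and one voxel response.
    rng = np.random.default_rng(0)
    n = 400
    a = rng.normal(size=(n, 3))
    b = rng.normal(size=(n, 2)) + a[:, :2]   # b is correlated with a
    c = rng.normal(size=(n, 2))
    y = a[:, 0] + b[:, 1] + 0.5 * c[:, 0] + rng.normal(size=n)
    for name, value in partition_three(y, a, b, c).items():
        print(f"{name}: {value:.3f}")
```

By construction the seven regions sum to the R² of the joint model. A real analysis would more plausibly use regularized regression and held-out data per voxel, in which case shared or unique estimates can come out slightly negative and need separate handling.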

The second part of this dissertation is a psychoacoustical investigation of the modulation power spectrum (MPS) of speech. The MPS is obtained by taking the 2-dimensional Fourier transform of the speech spectrogram. We showed that comprehension of vowels and consonants is affected differently by the removal of specific spectral or temporal modulations. A supplementary analysis of consonants showed differences in MPS and psychoacoustical comprehension results between three groups of consonants, separated by manner of articulation (fricatives, stops, and sonorants). The MPS could serve as an excellent intermediate step between lower and higher levels of speech processing and could, in future studies, add nuance to the three cortical models of speech perception examined in the first part.
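
The definition of the MPS as a 2-dimensional Fourier transform of the spectrogram can be sketched in a few lines. This is only a minimal illustration under stated assumptions: the abstract does not give the spectrogram parameters, so the Hann window, the window and hop lengths, the linear frequency axis, and the log-amplitude scaling are all choices made here, and the function names `log_spectrogram` and `modulation_power_spectrum` are hypothetical.

```python
import numpy as np

def log_spectrogram(signal, window_len=256, hop=64):
    """Log-amplitude short-time Fourier spectrogram, shape (freq, time).
    Window type and sizes are illustrative assumptions."""
    window = np.hanning(window_len)
    n_frames = 1 + (len(signal) - window_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + window_len] * window
        for i in range(n_frames)
    ])
    amplitude = np.abs(np.fft.rfft(frames, axis=1)).T
    return np.log(amplitude + 1e-10)  # small offset avoids log(0)

def modulation_power_spectrum(log_spec, freq_step_hz, time_step_s):
    """Squared magnitude of the 2-D Fourier transform of a spectrogram.
    Returns the power plus the spectral-modulation axis (cycles/Hz)
    and the temporal-modulation axis (Hz), both centered on zero."""
    centered = log_spec - log_spec.mean()  # drop the overall mean level
    power = np.abs(np.fft.fftshift(np.fft.fft2(centered))) ** 2
    spectral_mod = np.fft.fftshift(
        np.fft.fftfreq(log_spec.shape[0], d=freq_step_hz))
    temporal_mod = np.fft.fftshift(
        np.fft.fftfreq(log_spec.shape[1], d=time_step_s))
    return power, spectral_mod, temporal_mod

if __name__ == "__main__":
    # Synthetic stand-in for a speech recording: 2 s of noise at 8 kHz.
    sample_rate, window_len, hop = 8000, 256, 64
    rng = np.random.default_rng(0)
    signal = rng.normal(size=2 * sample_rate)
    spec = log_spectrogram(signal, window_len, hop)
    power, spectral_mod, temporal_mod = modulation_power_spectrum(
        spec, freq_step_hz=sample_rate / window_len,
        time_step_s=hop / sample_rate)
    print(power.shape, spectral_mod.shape, temporal_mod.shape)
```

In this representation, removing specific spectral or temporal modulations amounts to zeroing a region of the 2-D transform before inverting it, although the filtering procedure actually used in the experiments is not described in the abstract.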
