eScholarship
Open Access Publications from the University of California

Contextual Diversity and the Lexical Organization of Multiword Expressions

Abstract

Corpus-based models of lexical strength have questioned the role of word frequency in lexical organization. Specifically, closer fits to lexical behavior data on single words have been obtained with three measures: contextual diversity, which modifies frequency by ignoring word repetition within a context; semantic diversity, which considers the semantic consistency of a word's contextual distribution; and socially-based semantic diversity, which encodes the communication patterns of individuals across discourses (Adelman, Brown & Quesada, 2006; Jones, Johns, & Recchia, 2012; Johns, in press). The present work aimed to determine whether diversity also drives lexical organization at the level of multiword units. Normative ratings of familiarity for 210 English idioms (Libben & Titone, 2008) were predicted from contextual, semantic, and socially-based diversity measures computed from a 55-billion-word corpus of Reddit comments. Results confirmed the superiority of diversity measures over word frequency, suggesting that multiword idiomatic phrases show lexical organization dynamics similar to those of single words.
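
To make the contrast between raw frequency and contextual diversity concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the procedure used in the study: the function names are hypothetical, a "context" is assumed to be a single document (for example, one comment), tokenization is plain lowercased whitespace splitting, and the toy documents are invented for the example. Semantic and socially-based diversity are not covered.

```python
from collections import Counter


def count_phrase(tokens, phrase_tokens):
    """Count occurrences of phrase_tokens as a contiguous run in tokens."""
    n = len(phrase_tokens)
    return sum(
        1
        for i in range(len(tokens) - n + 1)
        if tokens[i:i + n] == phrase_tokens
    )


def frequency_and_contextual_diversity(documents, phrases):
    """Return (frequency, contextual_diversity) counters for each phrase.

    frequency counts every occurrence of a phrase in the corpus.
    contextual_diversity counts the contexts (here, documents: an
    assumption) that contain the phrase at least once, so repetition
    inside one context is ignored.
    """
    frequency = Counter()
    contextual_diversity = Counter()
    for doc in documents:
        tokens = doc.lower().split()  # naive tokenization (assumption)
        for phrase in phrases:
            hits = count_phrase(tokens, phrase.lower().split())
            frequency[phrase] += hits
            if hits > 0:
                contextual_diversity[phrase] += 1
    return frequency, contextual_diversity


if __name__ == "__main__":
    # Invented toy contexts, not drawn from any real corpus.
    toy_documents = [
        "he decided to kick the bucket and then kick the bucket again",
        "nobody wants to kick the bucket",
        "spill the beans spill the beans spill the beans",
    ]
    toy_phrases = ["kick the bucket", "spill the beans"]
    freq, cd = frequency_and_contextual_diversity(toy_documents, toy_phrases)
    for phrase in toy_phrases:
        print(phrase, "frequency:", freq[phrase], "contexts:", cd[phrase])
```

In this toy example both idioms occur three times, so frequency cannot separate them, while the context count can: one idiom is spread over two documents and the other is confined to one.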
