eScholarship
Open Access Publications from the University of California

UC Berkeley

UC Berkeley Electronic Theses and Dissertations

Unsupervised Text Generation and its Application to News Interfaces

Abstract

Recent progress in automated text generation relies predominantly on large datasets, sometimes requiring millions of examples for each application setting. In the first part of this thesis, we advance the field by developing novel text generation methods that balance the goals of fluency, consistency, and relevance without requiring any training data. We achieve this objective on tasks such as text summarization and simplification by directly defining a multi-component reward and training text generators to optimize it. The novel approaches that we introduce perform better than all existing unsupervised approaches and, in many cases, outperform those that rely on large datasets.

The second part of the thesis incorporates text generation into interfaces that help news readers navigate complex, unfolding news topics. We build a novel representation of news stories at scale and integrate new summarization, question generation, and question answering modules into a chatbot and an automated interactive podcast. Human evaluations confirm that, even though imperfect systems introduce friction for the user, they can serve as powerful tools to stimulate reader curiosity and help readers dive deeper into unfolding topics.
