Skip to main content
eScholarship
Open Access Publications from the University of California

UC Irvine

UC Irvine Previously Published Works bannerUC Irvine

Building and benchmarking the motivated deception corpus: Improving the quality of deceptive text through gaming

Abstract

When one studies fake news or false reviews, the first step to take is to find a corpus of text samples to work with. However, most deceptive corpora suffer from an intrinsic problem: there is little incentive for the providers of the deception to put their best effort, which risks lowering the quality and realism of the deception. The corpus described in this project, the Motivated Deception Corpus, aims to rectify this problem by gamifying the process of deceptive text collection. By having subjects play the game Two Truths and a Lie, and by rewarding those subjects that successfully fool their peers, we collect samples in such a way that the process itself improves the quality of the text. We have amassed a large corpus of deceptive text that is strongly incentivized to be convincing, and thus more reflective of real deceptive text. We provide results from several configurations of neural network prediction models to establish machine learning benchmarks on the data. This new corpus is demonstratively more challenging to classify with the current state of the art than previous corpora.

Many UC-authored scholarly publications are freely available on this site because of the UC's open access policies. Let us know how this access is important for you.

Main Content
For improved accessibility of PDF content, download the file to your device.
Current View