UC Santa Cruz Electronic Theses and Dissertations

Reputation Systems and Incentive Schemes for Quality Control in Crowdsourcing

Abstract

Crowdsourcing combines the abilities of computers and humans to solve tasks that computers alone find difficult. In crowdsourcing, computers process and aggregate input solicited from human workers; the quality of the workers' input is therefore crucial to the success of crowdsourced solutions. Performing quality control at scale is a difficult problem: workers can make mistakes, and computers alone, without human input, cannot verify the solutions. We develop reputation systems and incentive schemes for quality control in the context of different crowdsourcing applications.

To have a concrete source of crowdsourced data, we built CrowdGrader, a web-based peer-grading tool that lets students submit and grade solutions to homework assignments. In CrowdGrader, each submission receives several student-assigned grades, which are aggregated into a final grade using a novel algorithm based on a reputation system. We first give an overview of our work and of the peer-grading results obtained via CrowdGrader. Then, motivated by our experience, we propose hierarchical incentive schemes that are both truthful and cheap. The incentives are truthful in that the optimal worker behavior is to provide accurate evaluations; they are cheap in that they leverage the hierarchy, so they can be effected with a small number of supervised evaluations, and the strength of the incentive does not weaken with increasing hierarchy depth. We show that the proposed hierarchical schemes are robust: they provide incentives even in heterogeneous environments where workers may have limited proficiency, as long as there are enough proficient workers in the crowd. Interestingly, we also show that for these schemes to work, the only requirement is that workers know their place in the hierarchy in advance.
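As an illustration of the aggregation step, the sketch below jointly estimates consensus grades and grader reputations by a simple fixed-point iteration: each consensus grade is a reputation-weighted mean of the received grades, and each grader's reputation is the inverse of their average squared disagreement with the consensus. This is a minimal sketch of the general reputation-weighting idea under assumed update rules, not CrowdGrader's actual algorithm; the function name and the parameters n_iters and eps are hypothetical.

```python
def aggregate_grades(grades, n_iters=20, eps=1e-6):
    """grades: dict mapping submission -> {grader: numeric grade}."""
    graders = {g for per_sub in grades.values() for g in per_sub}
    reputation = {g: 1.0 for g in graders}  # start with uniform trust
    consensus = {}
    for _ in range(n_iters):
        # Consensus grade: reputation-weighted mean of the received grades.
        for sub, per_sub in grades.items():
            total = sum(reputation[g] for g in per_sub)
            consensus[sub] = sum(reputation[g] * x
                                 for g, x in per_sub.items()) / total
        # Reputation: inverse of the grader's mean squared error
        # against the current consensus grades.
        for g in graders:
            errs = [(per_sub[g] - consensus[sub]) ** 2
                    for sub, per_sub in grades.items() if g in per_sub]
            reputation[g] = 1.0 / (eps + sum(errs) / len(errs))
    return consensus, reputation
```

The hierarchical scheme can be pictured in the same spirit. In the hypothetical arrangement below, workers are placed in layers of geometrically growing width; the few workers in the top layer are checked directly against supervised evaluations, and every other worker is spot-checked by a worker one layer above, so the amount of supervision stays small no matter how deep the hierarchy grows. This illustrates only the structure the abstract describes, not the dissertation's exact scheme.

```python
import random

def assign_layers(workers, branching=3, seed=0):
    """Arrange workers in layers; layer k holds branching**k workers.
    Layer 0 is verified with supervised evaluations; each worker in
    layer k > 0 is spot-checked by a worker in layer k - 1."""
    rng = random.Random(seed)
    workers = list(workers)
    rng.shuffle(workers)
    layers, i, width = [], 0, 1
    while i < len(workers):
        layers.append(workers[i:i + width])
        i += width
        width *= branching
    checks = {w: rng.choice(layers[k - 1])
              for k in range(1, len(layers)) for w in layers[k]}
    return layers, checks
```

With branching factor 3, for example, a crowd of 1 + 3 + 9 + 27 = 40 workers needs supervised evaluations only for the single top-layer worker; everyone else is checked by a peer one layer up.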

As part of our study of user contributions in crowdsourcing and collaborative environments, we also study the problem of authorship attribution in revisioned content such as Wikipedia, where virtually anyone can edit an article. Information about the origin of a contribution is important for building a reputation system, as it can be used to assign reputation to editors according to the quality of their contributions. Since anyone can edit an article, and text can be deleted in one revision only to be reinserted in a later one, a robust method for attributing a new revision has to analyze all previous revisions of the article, not just the most recent one. We describe a novel authorship attribution algorithm that scales to very large repositories of revisioned content, as we show via experiments on the English Wikipedia.
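As a much-simplified illustration of the attribution problem (assumed here for exposition, not the dissertation's algorithm), the sketch below attributes each token of a new revision to the earliest prior revision that contains it inside a matching run of k consecutive tokens, so text that was deleted and later reinserted is still credited to its original author. A scalable method must avoid rescanning every past revision; this naive version rescans the full history and would not scale.

```python
def attribute(revisions, new_rev, k=3):
    """revisions: list of (author, tokens) in chronological order;
    new_rev: token list of the revision to attribute."""
    # Map each k-gram of tokens to the author of the earliest
    # revision that contains it.
    origin = {}
    for author, tokens in revisions:
        for i in range(len(tokens) - k + 1):
            origin.setdefault(tuple(tokens[i:i + k]), author)
    # A token inherits the author of any matching k-gram covering it;
    # tokens left at None are new in this revision.
    attribution = [None] * len(new_rev)
    for i in range(len(new_rev) - k + 1):
        author = origin.get(tuple(new_rev[i:i + k]))
        if author is not None:
            for j in range(i, i + k):
                attribution[j] = attribution[j] or author
    return attribution

history = [("alice", "the quick brown fox".split()),
           ("bob", "the quick red fox jumps".split())]
print(attribute(history, "a quick red fox jumps high".split()))
# -> [None, 'bob', 'bob', 'bob', 'bob', None]
```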
