eScholarship
Open Access Publications from the University of California

UC Riverside

UC Riverside Electronic Theses and Dissertations

Measuring and Inferring Demographics From Multi-Mode Communication Networks

Abstract

With the proliferation of electronic modes of communication (e.g., e-mails, phone calls and short messages), a group of people can form several distinct communication networks. A communication network is essentially a graph representation of who talks (or texts) to whom among a group of individuals. In this thesis, we conduct an empirical study of communication networks across two modern countries and focus on four questions: 1) what are the patterns of multimodal communication across countries and over time? 2) how much information can we extract regarding the roles of users? 3) can we infer the demographic properties of certain users? 4) can we predict the phone choices for a group of users? For the first question, I study the correlation between calling and texting across two countries, China and the U.S., and the evolution of the usage of the two communication modes over the last five years. I propose using communication channels (calling and texting) and time slices (weekday, weekend, and holiday) to study how people in China and the U.S. contact one another. This idea is inspired by the fact that different human relationships can be indicated by communications in different time slices. For example, texting tends to indicate a friendship, while calling on a weekday morning suggests a relationship between colleagues. For the second question, I examine the effect of communication channels on title prediction. I first show the similarities and differences in calling and texting between managers and ordinary employees and then propose a ranking algorithm, called HumanRank, which infers job titles with 10% higher accuracy than existing methods. I discuss how the texting graph is 10% better at title prediction than the calling graph.
For the third question, I explore the effect of time slices on the prediction of age group and income level on call networks. I study the correlation between calling features and demographic homophily and then propose a prediction algorithm that reaches an accuracy of 80% for age group and 71% for income level. Moreover, I discuss how the weekday graph is 15% better than the night-weekend graph in prediction accuracy. For the fourth question, I study correlations between demographics and phone preference. With the features emerging from the correlations, I devise a solution to infer a user's phone choice. Compared with existing methods, my solution reduces the error by one third and related costs by half.
