eScholarship
Open Access Publications from the University of California

UC San Diego

UC San Diego Electronic Theses and Dissertations

Using sparse CCA for vocabulary selection

Abstract

A content-based autotagging system is a computer system that automatically annotates multimedia data such as music, images, and video with tags (semantically meaningful text-based tokens) based solely on the multimedia content. When developing an autotagging system, three important design decisions are 1) selecting a vocabulary of tags, 2) choosing a feature-based representation of the multimedia content, and 3) picking a supervised learning framework. If we select a tag that cannot be applied consistently from the multimedia content alone (e.g., because human annotators use it inconsistently), or if the feature representation does not encode the information needed to apply that tag, then the supervised learning framework is unlikely to annotate novel multimedia content with that tag successfully. This paper proposes an approach to selecting a vocabulary of tags based on sparse canonical correlation analysis (sparse CCA). That is, sparse CCA is used to find a set of "acoustically meaningful" tags that are correlated with a chosen feature-based representation of the multimedia content. As a result, we find that our supervised autotagging system models the selected tags better. In this paper, we focus specifically on music, since we are interested in building a content-based music annotation system.
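
The abstract does not give the exact sparse CCA formulation, so the following is only a minimal sketch of the general idea, not the thesis's algorithm. It assumes a simplified variant that treats the within-set covariances as identity and runs a power iteration on the feature-tag cross-correlation matrix, with an L1 soft-threshold on the tag weights only. The function names, the penalty `lam`, and the matrix layout (songs in rows) are all hypothetical choices made for illustration.

```python
import numpy as np


def soft_threshold(v, lam):
    """Shrink each entry of v toward zero by lam; entries within lam become exactly 0."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)


def standardize(A):
    """Center each column and scale it to unit variance (constant columns become 0)."""
    A = np.asarray(A, dtype=float)
    return (A - A.mean(axis=0)) / (A.std(axis=0) + 1e-12)


def sparse_cca_tags(X, Y, lam=0.3, n_iter=100):
    """Sketch of a one-component sparse CCA between features X and tags Y.

    X: (n_songs, n_features) content features (assumed layout).
    Y: (n_songs, n_tags) tag annotations, e.g. 0/1 indicators (assumed layout).
    Returns (w_x, w_y); tags with a nonzero weight in w_y form the selected vocabulary.
    Assumption: within-set covariances are treated as identity, which is a
    simplification and not necessarily the formulation used in the thesis.
    """
    Xs, Ys = standardize(X), standardize(Y)
    C = Xs.T @ Ys / Xs.shape[0]  # (n_features, n_tags) cross-correlation matrix

    # Start from the leading singular vectors of the cross-correlation matrix.
    U, _, Vt = np.linalg.svd(C, full_matrices=False)
    w_x, w_y = U[:, 0], Vt[0]

    for _ in range(n_iter):
        # Update the feature weights (no sparsity penalty on this side).
        w_x = C @ w_y
        w_x /= np.linalg.norm(w_x) + 1e-12
        # Update the tag weights; soft-thresholding zeroes weakly correlated tags.
        w_y = soft_threshold(C.T @ w_x, lam)
        norm = np.linalg.norm(w_y)
        if norm == 0.0:  # penalty too strong: every tag was discarded
            break
        w_y /= norm
    return w_x, w_y
```

Under these assumptions, calling `w_x, w_y = sparse_cca_tags(X, Y, lam=0.3)` and then `np.flatnonzero(w_y)` gives the indices of the tags that survive the penalty. Raising `lam` shrinks the vocabulary, and lowering it keeps more tags, so in practice the penalty would have to be tuned for the data at hand.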
