eScholarship
Open Access Publications from the University of California

UC Santa Barbara Electronic Theses and Dissertations

Sparse and Low-rank Matrix Decomposition – Application in Finance

No data is associated with this publication.
Abstract

The machine learning literature has seen rapid growth in techniques and applications of sparse and low-rank matrix decompositions. Typically formulated as an optimization problem involving nuclear norm minimization, this paradigm offers computational efficiency and robust statistical recovery guarantees, in contrast with the NP-hardness of rank-based objectives. This thesis develops new methodology (Chapter 2) and applies it to finance (Chapter 3), as described below. Chapter 1 provides the necessary background and a comprehensive survey of the related literature.
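To illustrate the nuclear-norm paradigm mentioned above: the basic building block of most such convex formulations is singular-value thresholding, the proximal operator of the nuclear norm. The sketch below (all names, parameters, and the synthetic data are illustrative, not from the thesis) shows how one thresholding step recovers a low-rank estimate without any combinatorial rank search.

```python
import numpy as np

def svt(M, tau):
    """Singular-value thresholding: the proximal operator of the nuclear
    norm. Shrinks every singular value by tau and truncates at zero,
    which is the convex surrogate for penalizing rank directly."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

# A rank-2 matrix observed with small noise: one thresholding step
# already produces a rank-2 estimate, with no NP-hard rank search.
rng = np.random.default_rng(0)
low_rank = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 40))
noisy = low_rank + 0.01 * rng.standard_normal((50, 40))
approx = svt(noisy, tau=1.0)
```

Because the noise singular values fall below the threshold while the signal singular values sit far above it, the shrinkage kills the former and merely dampens the latter.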

Chapter 2 concerns dimensionality reduction methods such as principal component analysis (PCA) and factor analysis, which are central to many problems in data science. There are, however, serious and well-understood challenges in finding robust low-dimensional approximations for data with significant heteroskedastic noise. This chapter introduces a relaxed version of Minimum Trace Factor Analysis (MTFA), a convex optimization method with roots dating back to the work of Ledermann in 1940. The relaxation is particularly effective at not overfitting to heteroskedastic perturbations and addresses the commonly cited Heywood cases in factor analysis as well as the recently identified "curse of ill-conditioning" for existing spectral methods. We provide theoretical guarantees on the accuracy of the recovered low-rank subspace and on the convergence rate of the algorithm proposed to compute it. We develop a number of connections to existing methods, including HeteroPCA, Lasso, and Soft-Impute, filling an important gap in the already large literature on low-rank matrix estimation. Numerical experiments benchmark our results against several recent proposals for handling heteroskedastic noise.
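To make the heteroskedastic-noise problem concrete, the sketch below illustrates the diagonal-deletion idea behind HeteroPCA, one of the existing methods the chapter connects to. Heteroskedastic noise inflates only the diagonal of a sample covariance, so the estimate iteratively re-imputes the diagonal with that of the current rank-r approximation. This is an illustrative heuristic, not the thesis's relaxed-MTFA estimator.

```python
import numpy as np

def hetero_pca(S, r, n_iter=50):
    """Iteratively replace the diagonal of S with the diagonal of its
    rank-r eigendecomposition, so heteroskedastic noise variances on
    the diagonal do not contaminate the leading subspace."""
    M = S.copy()
    for _ in range(n_iter):
        w, V = np.linalg.eigh(M)
        top = np.argsort(w)[::-1][:r]          # indices of the r largest eigenvalues
        M_r = (V[:, top] * w[top]) @ V[:, top].T
        np.fill_diagonal(M, np.diag(M_r))      # re-impute only the diagonal
    return M_r

# Rank-2 signal covariance plus heteroskedastic noise on the diagonal.
rng = np.random.default_rng(1)
B = rng.standard_normal((30, 2))
signal = B @ B.T
noise_var = rng.uniform(0.1, 2.0, size=30)
M_hat = hetero_pca(signal + np.diag(noise_var), r=2)
```

Since the noise perturbs only the diagonal, the off-diagonal entries are untouched by the iteration and the recovered rank-2 subspace aligns closely with the true one.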

In Chapter 3, we shift focus to factor analysis of security returns. Commercially successful factor analysis has traditionally relied on fundamental models, despite a rich academic literature exploring statistical models. Classical statistical approaches such as PCA and maximum likelihood have seen some success but suffer from drawbacks such as a lack of robustness and insensitivity to narrow factors. To address these limitations, we propose convex optimization methods, inspired by the techniques of Chapter 2, that decompose a security-return covariance matrix into low-rank and sparse components. The low-rank component captures broad factors affecting most securities, while the sparse component accounts for narrow factors and security-specific effects. We illustrate the efficacy of this approach by measuring the variance-forecasting accuracy of a low-rank-plus-sparse covariance matrix estimator in simulations and in an empirical analysis of global equity data, showing improvements over PCA-based methods.
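A generic low-rank-plus-sparse decomposition of the kind described above can be sketched with alternating proximal steps: singular-value shrinkage pulls one component toward low rank (broad factors) and entrywise soft-thresholding pulls the other toward sparsity (narrow factors and security-specific effects). This is a standard heuristic sketch with illustrative parameters and synthetic data, not the exact estimator of Chapter 3.

```python
import numpy as np

def soft_threshold(X, lam):
    """Entrywise soft-thresholding: the proximal operator of lam * ||X||_1."""
    return np.sign(X) * np.maximum(np.abs(X) - lam, 0.0)

def lowrank_plus_sparse(S, tau, lam, n_iter=100):
    """Alternately minimize 0.5*||S - L - R||_F^2 + tau*||L||_* + lam*||R||_1
    over L (low-rank part) and R (sparse part), each step being an exact
    proximal update, so the objective decreases monotonically."""
    L = np.zeros_like(S)
    R = np.zeros_like(S)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(S - R, full_matrices=False)
        L = (U * np.maximum(s - tau, 0.0)) @ Vt   # singular-value shrinkage
        R = soft_threshold(S - L, lam)            # entrywise shrinkage
    return L, R

# Synthetic "covariance": 3 broad factors plus sparse specific variances.
rng = np.random.default_rng(2)
F = rng.standard_normal((40, 3))
S = F @ F.T + np.diag(rng.uniform(0.5, 1.5, size=40))
L_hat, R_hat = lowrank_plus_sparse(S, tau=5.0, lam=0.5)
```

On this example the low-rank part absorbs the three broad factors while the sparse part retains mostly the diagonal of specific variances, mirroring the broad/narrow split the chapter describes.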


This item is under embargo until February 8, 2026.