eScholarship
Open Access Publications from the University of California

UC Berkeley Electronic Theses and Dissertations

Random Matrix Theory in Numerical Linear Algebra

Abstract

We use techniques from random matrix theory and high-dimensional probability to shed light on several problems in numerical linear algebra. We focus on two main topics: (1) the problem of approximately computing the eigenvalues and eigenvectors of a given non-Hermitian matrix, and (2) the problem of approximating the spectral distribution and the extreme eigenvalues of a Hermitian matrix via the Lanczos algorithm.

\textbf{Diagonalization.}

We confirm a 2007 conjecture of Davies \cite{davies}: for each $\delta\in (0,1)$, every matrix $A\in \mathbb{C}^{n\times n}$ is at most $\delta\|A\|$ away from one whose eigenvectors have condition number at worst $c_n/\delta$, for some $c_n$ depending only on $n$. We further show that the dependence on $\delta$ cannot be improved to $1/\delta^p$ for any constant $p<1$.

Our proof uses tools from random matrix theory to show that the pseudospectrum of $A$ can be regularized with the addition of a complex Gaussian perturbation. Along the way, we explain how a variant of a theorem of \'Sniady implies a conjecture of Sankar, Spielman and Teng on the optimal constant for smoothed analysis of condition numbers.

Next, using this idea of adding a complex Gaussian perturbation as a preprocessing step, we exhibit a randomized algorithm which, given a square matrix $A\in \mathbb{C}^{n\times n}$ with $\|A\|\le 1$ and $\delta>0$, computes with high probability an invertible $V$ and diagonal $D$ such that

$$\|A - VDV^{-1}\| \le \delta$$

in $O(T_{\mathrm{MM}}(n)\log^2(n/\delta))$ arithmetic operations on a floating point machine with $O(\log^4(n/\delta)\log n)$ bits of precision. The computed similarity $V$ additionally satisfies $\|V\|\|V^{-1}\|\le O(n^{2.5}/\delta)$. Here $T_{\mathrm{MM}}(n)$ is the number of arithmetic operations required to multiply two $n\times n$ complex matrices numerically stably, known to satisfy $T_{\mathrm{MM}}(n)=O(n^{\omega+\eta})$ for every $\eta>0$, where $\omega$ is the exponent of matrix multiplication \cite{DDHK}. After the initial Gaussian perturbation, the remainder of the algorithm is a variant of the spectral bisection algorithm in numerical linear algebra \cite{beavers1974new}. Our running time is optimal up to polylogarithmic factors, in the sense that verifying that a given similarity diagonalizes a matrix requires at least matrix multiplication time.
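The preprocessing step above can be illustrated in a few lines of NumPy. This is a minimal sketch only: it adds a scaled complex Ginibre perturbation to regularize the pseudospectrum and then calls a black-box eigensolver, whereas the dissertation's algorithm instead uses spectral bisection to obtain the stated running-time and precision guarantees. The function name and interface are illustrative, not from the source.

```python
import numpy as np

def regularized_diagonalization(A, delta, rng=np.random.default_rng()):
    """Sketch of the perturb-then-diagonalize idea: regularize the
    pseudospectrum of A with a small complex Gaussian perturbation,
    then diagonalize the perturbed matrix.

    Illustrative only: np.linalg.eig stands in for the spectral
    bisection routine used in the actual algorithm.
    """
    n = A.shape[0]
    # Complex Ginibre matrix with i.i.d. entries of variance 1/n,
    # so that ||G|| is about 2 with high probability.
    G = (rng.standard_normal((n, n))
         + 1j * rng.standard_normal((n, n))) / np.sqrt(2 * n)
    evals, V = np.linalg.eig(A + delta * G)
    return V, np.diag(evals)
```

For small `delta`, the reconstruction error $\|A - VDV^{-1}\|$ is roughly $\delta\|G\| \approx 2\delta$ with high probability, while the perturbation makes the eigenvector matrix well conditioned even when $A$ itself is badly non-normal (e.g. a Jordan block).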

\textbf{The Lanczos algorithm.} We study the Lanczos algorithm where the initial vector is sampled uniformly from $\mathbb{S}^{n-1}$. Let $A$ be an $n \times n$ Hermitian matrix. We show that when run for only a few iterations, the output of the algorithm on $A$ is almost deterministic. More precisely, we show that for any $\varepsilon \in (0, 1)$ there exists $c >0$, depending only on $\varepsilon$ and a certain global property of the spectrum of $A$ (in particular, not depending on $n$), such that when Lanczos is run for at most $c \log n$ iterations, the Jacobi coefficients and the Ritz values deviate from their medians by $t$ with probability at most $\exp(-n^\varepsilon t^2)$, for $t<\Vert A \Vert$. A similar result is derived for the Ritz vectors. The proof relies on the local L{\'e}vy lemma, a tool in high-dimensional probability regarding concentration of measure for functions that are Lipschitz on a large region of the sphere, as well as on classical connections between the Lanczos algorithm and orthogonal polynomials.
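For concreteness, the Lanczos iteration studied here can be sketched as follows: starting from a uniformly random unit vector, it builds a tridiagonal Jacobi matrix whose entries are the Jacobi coefficients and whose eigenvalues are the Ritz values. This is a standard textbook formulation (with full reorthogonalization for numerical stability), not code from the dissertation.

```python
import numpy as np

def lanczos(A, k, rng=np.random.default_rng()):
    """Run k Lanczos iterations on a Hermitian matrix A with a
    uniformly random unit initial vector. Returns the Jacobi
    coefficients (alphas, betas) and the Ritz values, i.e. the
    eigenvalues of the k x k tridiagonal Jacobi matrix."""
    n = A.shape[0]
    # A Gaussian vector normalized to unit length is uniform on S^{n-1}.
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    V = np.zeros((n, k))
    alphas, betas = np.zeros(k), np.zeros(k - 1)
    V[:, 0] = v
    for j in range(k):
        w = A @ V[:, j]
        alphas[j] = V[:, j] @ w
        w -= alphas[j] * V[:, j]
        if j > 0:
            w -= betas[j - 1] * V[:, j - 1]
        # Full reorthogonalization against all previous Lanczos vectors.
        w -= V[:, :j + 1] @ (V[:, :j + 1].T @ w)
        if j < k - 1:
            betas[j] = np.linalg.norm(w)
            V[:, j + 1] = w / betas[j]
    # Jacobi matrix: alphas on the diagonal, betas on the off-diagonals.
    T = np.diag(alphas) + np.diag(betas, 1) + np.diag(betas, -1)
    return alphas, betas, np.linalg.eigvalsh(T)
```

The concentration result says that for $k \le c\log n$, the arrays `alphas`, `betas`, and the returned Ritz values are, with high probability, nearly independent of the random initial vector.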

Furthermore, we show that the Lanczos algorithm fails with high probability to identify outliers of the spectrum when run for at most $c' \log n$ iterations, where again $c'$ depends only on the same global property of the spectrum of $A$. Classical results imply that the bound $c' \log n$ is tight up to a constant factor.

Our techniques also yield asymptotic results: Suppose we have a sequence of Hermitian matrices $A_n \in M_n(\mathbb{C})$ whose spectral distributions converge in Kolmogorov distance with rate $O(n^{-\varepsilon})$ to a density, for some $\varepsilon > 0$. Then we show that for large enough $n$, and for $k=O(\sqrt{\log n})$, the Ritz values after $k$ iterations concentrate around the roots of the $k$th orthogonal polynomial with respect to the limiting density.
