PDF
\(\newcommand{\footnotename}{footnote}\) \(\def \LWRfootnote {1}\) \(\newcommand {\footnote }[2][\LWRfootnote ]{{}^{\mathrm {#1}}}\) \(\newcommand {\footnotemark }[1][\LWRfootnote ]{{}^{\mathrm {#1}}}\) \(\let \LWRorighspace \hspace \) \(\renewcommand {\hspace }{\ifstar \LWRorighspace \LWRorighspace }\) \(\newcommand {\mathnormal }[1]{{#1}}\) \(\newcommand \ensuremath [1]{#1}\) \(\newcommand {\LWRframebox }[2][]{\fbox {#2}} \newcommand {\framebox }[1][]{\LWRframebox } \) \(\newcommand {\setlength }[2]{}\) \(\newcommand {\addtolength }[2]{}\) \(\newcommand {\setcounter }[2]{}\) \(\newcommand {\addtocounter }[2]{}\) \(\newcommand {\arabic }[1]{}\) \(\newcommand {\number }[1]{}\) \(\newcommand {\noalign }[1]{\text {#1}\notag \\}\) \(\newcommand {\cline }[1]{}\) \(\newcommand {\directlua }[1]{\text {(directlua)}}\) \(\newcommand {\luatexdirectlua }[1]{\text {(directlua)}}\) \(\newcommand {\protect }{}\) \(\def \LWRabsorbnumber #1 {}\) \(\def \LWRabsorbquotenumber "#1 {}\) \(\newcommand {\LWRabsorboption }[1][]{}\) \(\newcommand {\LWRabsorbtwooptions }[1][]{\LWRabsorboption }\) \(\def \mathchar {\ifnextchar "\LWRabsorbquotenumber \LWRabsorbnumber }\) \(\def \mathcode #1={\mathchar }\) \(\let \delcode \mathcode \) \(\let \delimiter \mathchar \) \(\def \oe {\unicode {x0153}}\) \(\def \OE {\unicode {x0152}}\) \(\def \ae {\unicode {x00E6}}\) \(\def \AE {\unicode {x00C6}}\) \(\def \aa {\unicode {x00E5}}\) \(\def \AA {\unicode {x00C5}}\) \(\def \o {\unicode {x00F8}}\) \(\def \O {\unicode {x00D8}}\) \(\def \l {\unicode {x0142}}\) \(\def \L {\unicode {x0141}}\) \(\def \ss {\unicode {x00DF}}\) \(\def \SS {\unicode {x1E9E}}\) \(\def \dag {\unicode {x2020}}\) \(\def \ddag {\unicode {x2021}}\) \(\def \P {\unicode {x00B6}}\) \(\def \copyright {\unicode {x00A9}}\) \(\def \pounds {\unicode {x00A3}}\) \(\let \LWRref \ref \) \(\renewcommand {\ref }{\ifstar \LWRref \LWRref }\) \( \newcommand {\multicolumn }[3]{#3}\) \(\require {textcomp}\) \(\newcommand {\intertext }[1]{\text {#1}\notag \\}\) \(\let \Hat \hat \) \(\let \Check \check \) \(\let \Tilde \tilde \) \(\let \Acute \acute \) \(\let \Grave \grave \) \(\let \Dot \dot \) \(\let \Ddot \ddot \) \(\let \Breve \breve \) \(\let \Bar \bar \) \(\let \Vec \vec \)

Symmetric Pseudospectral Shattering and Fast Divide-and-Conquer for the Definite Generalized Eigenvalue Problem

James Demmel, Ioana Dumitriu, and Ryan Schneider

Abstract

Overview: We adapt the asymptotically fastest-known algorithm for diagonalizing arbitrary matrix pencils – as well as the related phenomenon of pseudospectral shattering – to the definite generalized eigenvalue problem. Put simply, we obtain significant efficiency gains by preserving and exploiting structure, in this case symmetry. In doing so, our work provides a general road map for tailoring fast diagonalization to structured problems.

Recent work in randomized numerical linear algebra produced the first sub-\(O(n^3)\) algorithms for diagonalizing any matrix \(A\) or matrix pencil \((A,B)\) [3, 5]. The key insight of this work is the phenomenon of pseudospectral shattering, where a random perturbation to a matrix, or pencil, has a regularizing effect on its (pseudo)spectrum. A result of smoothed analysis [11], shattering is characterized by a minimum eigenvalue gap and minimally well-conditioned eigenvectors. Moreover, it implies success for fast divide-and-conquer eigensolvers, which can diagonalize a perturbed matrix/pencil with essentially optimal complexity (that is, complexity equal to matrix multiplication up to log factors). The name “pseudospectral shattering” is derived from the fact that a random grid covering the \(\epsilon \)-pseudospectra of the perturbed problem separates its disjoint components, and the eigenvalues they contain, into separate grid boxes for \(\epsilon \) sufficiently small – i.e., inverse polynomial in \(n\).

Pseudospectral shattering suggests a simple, high-level approach to eigenvalue problems: apply a random perturbation and run a fast version of divide-and-conquer, where the shattering grid can be used to reliably divide the spectrum at each step. The result is an accurate diagonalization, in the backward-error sense, provided the initial perturbation is small. Prior to [3], which introduced pseudospectral shattering in the context of the standard eigenvalue problem, no way of leveraging divide-and-conquer’s natural parallelization to obtain fast diagonalizations of arbitrary matrices (or pencils) was known. Importantly, [5] established that this approach can be implemented without relying on matrix inversion, thereby promoting stability while also minimizing associated communication costs (following Ballard et al. [2]).

These randomized eigensolvers, which we refer to collectively as pseudospectral divide-and-conquer, are fully general. In particular, both [3] and [5] allow matrices to be arbitrary and apply Ginibre perturbations to obtain a guarantee of pseudospectral shattering. This begs the question: how can we adapt these algorithms to better handle symmetric or sparse inputs, for which dense Gaussian perturbations are structure-destroying? Going further: if we can achieve pseudospectral shattering while maintaining structure – i.e., via structured perturbations – how can we translate that into efficiency gains in divide-and-conquer?

We answer these question for the definite generalized eigenvalue problem, which corresponds to pencils \((A,B)\) in which \(A\) and \(B\) are Hermitian and the Crawford number \(\gamma (A,B)\) satisfies

\begin{equation} \gamma (A,B) = \min _{||x||_2 = 1} |x^H(A+iB)x| > 0. \end{equation}

Pencils arising in scientific computing and machine learning are often definite [7, 6]. We note two important sub-problems in particular: (1) the Hermitian eigenvalue problem, corresponding to \(B = I\), and (2) the generalized symmetric definite eigenvalue problem, in which \(B\) is positive definite.

As inputs to divide-and-conquer, definite pencils exhibit a number of properties that can be leveraged for improved efficiency. Most notably, the eigenvalues of a definite pencil \((A,B)\) are real (and in fact any \(\epsilon \)-pseudospectrum that considers only Hermitian perturbations to \(A\) and \(B\) will be constrained to the real axis for \(\epsilon \) sufficiently small). Additionally, definite pencils are regular and satisfy stronger eigenvalue/eigenvector perturbation bounds than the generic case (see e.g., [12]). Finally, where an arbitrary pencil \((A,B)\) is diagonalized by a pair of eigenvector matrices – if it is diagonalizable at all – a definite pencil can always be diagonalized by a single matrix. That is, for any definite pencil \((A,B)\) there exists invertible \(X\) such that

\begin{equation} \label {eqn: definite_diag} (X^HAX, X^HBX) = (\Lambda _A, \Lambda _B) \end{equation}

for \(\Lambda _A\) and \(\Lambda _B\) diagonal.

Motivated by these observations, we devise a version of pseudospectral divide-and-conquer that pursues efficiency by maintaining definiteness through both the initial random perturbation and the subsequent recursive procedure. The main ingredients are the following:

Combining points one and two yields a specialized version of pseudospectral divide-and-conquer that is significantly more efficient on definite inputs. Ongoing work seeks high performance implementations of both standard pseudospectral divide-and-conquer and this specialization. Accordingly, and in parallel with broader efforts to deploy randomized algorithms in numerical linear algebra [9], our work represents an important step towards bringing fast, randomized diagonalization to practice.

References