
Algorithms for Hermitian eigenproblems and their complexity

Aleksandros Sobczyk

Abstract

Hermitian eigenproblems, and, more broadly, Hermitian-definite pencils, arise naturally in many real-world applications in Machine Learning, Scientific Computing, and Engineering. Given a Hermitian matrix \(\mathbf{A}\) and a Hermitian positive-definite matrix \(\mathbf{B}\), the goal is to compute (a subset of) the eigenvalues \(\lambda\) and/or the eigenvectors \(\mathbf{v}\), which satisfy

\begin{align*} \mathbf{A}\mathbf{v} = \lambda \mathbf{B}\mathbf{v}. \end{align*} In data science and machine learning, for example, they arise in Spectral Clustering [31, 39], Language Models [22], Image Processing [36, 1], Principal Component Analysis [11, 24, 37], and many others [12, 29, 14, 9, 28, 2]. A ubiquitous application in Scientific Computing is the computation of the density matrices and the electron densities in Density Functional Theory (DFT) [27].
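For concreteness, the following Python sketch solves a small Hermitian-definite pencil with SciPy's LAPACK-backed generalized eigensolver; the matrices and dimensions below are arbitrary illustrative choices, not part of the algorithms discussed in this work.

    import numpy as np
    from scipy.linalg import eigh

    # Generalized Hermitian-definite eigenproblem A v = lambda B v.
    rng = np.random.default_rng(0)
    n = 5
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    A = (M + M.conj().T) / 2               # Hermitian
    N = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    B = N @ N.conj().T + n * np.eye(n)     # Hermitian positive definite

    lam, V = eigh(A, B)                    # eigenpairs of the pencil (A, B)
    print(np.allclose(A @ V, B @ V * lam)) # A v_i = lam_i B v_i, column-wise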

Algorithms for eigenproblems have been studied since at least the nineteenth century, with some of the earliest references attributed to Jacobi [23, 16]. In this work we revisit algorithms from the computational complexity point of view, targeting \((i)\) eigenvalue problems, which involve the approximation of eigenvalues, singular values, spectral gaps, and condition numbers, \((ii)\) eigenspace problems, such as the approximation of eigenvectors, spectral projectors, and invariant subspaces, and \((iii)\) diagonalization problems, which refer to the computation of all the eigenvalues and eigenvectors of a matrix, for example, a full spectral factorization or the SVD.

This work is a summary of the results in [41, 40] for some variants of the aforementioned problems, which were part of the author's doctoral dissertation. All of the algorithms and complexity bounds are stated in the following three models of computation:

Exact real arithmetic (Real RAM),

where a processor can execute the following operations: \(\{+,-,\times,/,\sqrt{\cdot},>\}\), on real numbers, in constant time and without any rounding errors;

Rational arithmetic,

where numbers are represented as rationals, consisting of an integer numerator and denominator of finite (but variable) bit length; each arithmetic operation introduces no error, but takes time proportional to the bit lengths of the operands (see the sketch after this list);

Floating point,

where any real number \(\alpha \in \mathbb{R}\) is rounded to a floating point number \(\mathrm{fl}(\alpha) = s \times 2^{e-t} \times m\), and each arithmetic operation \(\odot \in \{+,-,\times,/,\sqrt{\cdot}\}\) also introduces a rounding error satisfying \(\mathrm{fl}(\alpha \odot \beta) = (1+\theta)(\alpha \odot \beta), \qquad \mathrm{fl}(\sqrt{\alpha}) = (1+\theta)\sqrt{\alpha},\) where \(|\theta| \leq \mathbf{u}\), the unit roundoff. Assuming a total of \(b\) bits for each number, every floating point operation costs \(\mathcal{F}(b)\) bit operations, where typically it is assumed that \(\mathcal{F}(b) \in \widetilde{O}(b)\) [19].
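A minimal sketch contrasting the two finite-bit models, assuming IEEE double precision (unit roundoff \(\mathbf{u} = 2^{-53}\)) and Python's exact Fraction type:

    from fractions import Fraction

    # Rational arithmetic: exact results, but cost grows with bit length.
    exact = Fraction(1, 3)

    # Floating point: fl(1/3) is rounded, with relative error |theta| <= u.
    approx = 1.0 / 3.0
    theta = abs(Fraction(approx) - exact) / exact  # Fraction(float) is exact
    u = Fraction(1, 2**53)                         # unit roundoff for doubles
    assert theta <= u
    print(float(theta), float(u))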

Algorithms in Real RAM. In the Real RAM model, we analyze the following problems:

We first provide an end-to-end complexity analysis of the divide-and-conquer algorithm of Gu and Eisenstat [18] for the diagonalization of tridiagonal and arrowhead matrices, when accelerated with the Fast Multipole Method [35]. By carefully analyzing all of the steps of the algorithm, we show that it provides provable approximation guarantees with the claimed nearly-\(O(n^2)\) arithmetic complexity, which significantly improves on classic (dense) eigensolvers such as the QR algorithm.
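For illustration of the problem being solved (not of the FMM-accelerated algorithm itself), the sketch below diagonalizes a symmetric tridiagonal matrix with SciPy's LAPACK-backed tridiagonal solver; the random instance and size are arbitrary choices.

    import numpy as np
    from scipy.linalg import eigh_tridiagonal

    # A symmetric tridiagonal matrix, stored by its diagonal d and
    # off-diagonal e.
    n = 1000
    rng = np.random.default_rng(0)
    d = rng.standard_normal(n)       # main diagonal
    e = rng.standard_normal(n - 1)   # first off-diagonal

    # LAPACK-backed tridiagonal eigensolver (not the FMM-accelerated
    # divide-and-conquer variant analyzed in this work).
    lam, V = eigh_tridiagonal(d, e)

    # Verify that the residual ||T V - V diag(lam)|| is small.
    T = np.diag(d) + np.diag(e, 1) + np.diag(e, -1)
    print(np.linalg.norm(T @ V - V * lam))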

The tridiagonal diagonalization algorithm can be efficiently combined with the (rather overlooked) tridiagonal reduction algorithm of Schönhage [38], who proved that a Hermitian matrix can be reduced to tridiagonal form with unitary similarity transformations in \(O(n^\omega)\) arithmetic operations. Here \(\omega \lesssim 2.371\) is the current best known upper bound for the matrix multiplication exponent. This way, we can diagonalize a Hermitian matrix in nearly matrix multiplication time, improving on the \(O(n^3)\) arithmetic complexity of classic algorithms [30, 17], as well as the more recent \(O(n^{\omega}\log^2(\tfrac{n}{\epsilon}))\) complexity of the randomized algorithm of [3]. Similar bounds are obtained for the deterministic complexity of the SVD. Many theoretical works assume the exact computation of an SVD as a "black-box" subroutine; see e.g. [34, 15, 13, 7, 9, 8], to name a few. However, its complexity and approximation guarantees are often unspecified. Our main results and comparisons with existing algorithms are outlined in Table 1.
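The two-phase structure (unitary reduction to tridiagonal form, then a tridiagonal eigensolver) can be sketched as follows. Here SciPy's classical Householder-based Hessenberg reduction, which costs \(O(n^3)\), stands in for Schönhage's \(O(n^\omega)\) reduction; for Hermitian input the Hessenberg form is tridiagonal.

    import numpy as np
    from scipy.linalg import hessenberg, eigh_tridiagonal

    rng = np.random.default_rng(1)
    n = 300
    A = rng.standard_normal((n, n))
    A = (A + A.T) / 2   # real symmetric (Hermitian) test matrix

    # Phase 1: A = Q T Q^T with T tridiagonal (the Hessenberg form of a
    # symmetric matrix is tridiagonal, up to rounding).
    T, Q = hessenberg(A, calc_q=True)
    d = np.diag(T).copy()
    e = np.diag(T, 1).copy()   # equals the subdiagonal, up to rounding

    # Phase 2: diagonalize the tridiagonal matrix, then map back.
    lam, W = eigh_tridiagonal(d, e)
    V = Q @ W                  # eigenvectors of A

    print(np.linalg.norm(A @ V - V * lam))  # small residual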

Table 1: Complexities for diagonalization problems in the Real RAM model. The randomized algorithms succeed with high probability (at least \(1-1/\mathrm{poly}(n)\)). \(O(n^{\omega(a,b,c)})\) denotes the complexity of multiplying two matrices with sizes \(n^a\times n^b\) and \(n^b\times n^c\), respectively.

Algorithm | Arithmetic Complexity | Deterministic
Arrowhead/Tridiagonal diagonalization
[10, 32, 18] | \(O(n^3)+\widetilde{O}(n^2)\) | Y
Our results | \(O(n^2\,\mathrm{polylog}(\tfrac{n}{\epsilon}))\) | Y
Hermitian diagonalization
[3, 25] | \(O(n^{\omega}\log^2(\tfrac{n}{\epsilon}))\) | N
[4] | \(\widetilde{O}(n^{\omega+1})\) | N
[30] | \(O(n^{3})\) | Y
Our results | \(O(n^{\omega}\log(n) + n^2\,\mathrm{polylog}(\tfrac{n}{\epsilon}))\) | Y
SVD
Fast MM + [30] | \(O(n^{\omega(1,k,1)}) + \widetilde{O}(n^3)\) | Y
[3, 25] | \(O(n^{\omega(1,k,1)} + n^{\omega}\log^2(\tfrac{n}{\epsilon}))\) | N
Our results | \(O(n^{\omega(1,k,1)} + n^{\omega}\log(n) + n^2\,\mathrm{polylog}(\tfrac{n\kappa(\mathbf{A})}{\epsilon}))\) | Y

Algorithms in finite precision. We obtain similar deterministic complexity upper bounds for several problems in finite precision. We present a stability analysis of the tridiagonal reduction algorithm of Schönhage, and its combination with the tridiagonal eigenvalue solver of [5, 6] (in rational arithmetic), to compute all the eigenvalues of a Hermitian matrix in nearly \(O(n^{\omega})\) bit operations, deterministically.
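For intuition, here is a minimal rational-arithmetic bisection solver for tridiagonal eigenvalues, based on classical Sturm-sequence/inertia counting. It is a simplified stand-in for the solver of [5, 6], not that algorithm itself; the nudge constant and tolerance are arbitrary choices.

    from fractions import Fraction

    def eigs_below(d, e, x):
        """Number of eigenvalues of the symmetric tridiagonal matrix
        (diagonal d, off-diagonal e, entries as Fractions) strictly less
        than x, via the exact inertia of T - x*I (Sylvester's law)."""
        count, piv = 0, None
        for i in range(len(d)):
            piv = d[i] - x - (e[i - 1] ** 2 / piv if i > 0 else 0)
            if piv == 0:
                # x hits an eigenvalue of a leading minor; nudge the shift.
                # Exactness is preserved since the nudge is itself rational;
                # the count can only change if an eigenvalue lies within
                # 2^-200 of x.
                return eigs_below(d, e, x - Fraction(1, 2 ** 200))
            count += piv < 0
        return count

    def kth_eigenvalue(d, e, k, tol):
        """Bisection for the k-th smallest eigenvalue (0-indexed),
        entirely in rational arithmetic."""
        n = len(d)
        # Gershgorin discs give an interval containing every eigenvalue.
        radii = [(abs(e[i - 1]) if i > 0 else 0)
                 + (abs(e[i]) if i < n - 1 else 0) for i in range(n)]
        lo = min(d[i] - radii[i] for i in range(n))
        hi = max(d[i] + radii[i] for i in range(n))
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if eigs_below(d, e, mid) > k:   # more than k eigenvalues below
                hi = mid
            else:
                lo = mid
        return (lo + hi) / 2

    # Example: the 4x4 second-difference matrix tridiag(-1, 2, -1), whose
    # smallest eigenvalue is 2 - 2 cos(pi/5) ~ 0.381966.
    d = [Fraction(2)] * 4
    e = [Fraction(-1)] * 3
    print(float(kth_eigenvalue(d, e, 0, Fraction(1, 10 ** 12))))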

Our main results, which are summarized in Table 2, improve several existing algorithms in different aspects. Pan and Chen [33] showed how to achieve additive errors in the eigenvalues in \(O(n^\omega)\) arithmetic operations, but this increases to \(O(n^{\omega+1})\) Boolean operations in rational arithmetic. The algorithm of [30] requires \(\widetilde{O}(n^3)\) bit operations for all the eigenvalues. In the randomized setting, [3] also provides forward errors for the approximate eigenvalues in the Hermitian case, by exploiting a perturbation bound by Kahan [26, 41], in \(O(n^\omega\,\mathrm{polylog}(n/\epsilon))\) Boolean operations, albeit with a fairly large power of the logarithm. We also provide the analysis for several other useful subroutines related to eigenvalue computations, including singular values, condition numbers, Hermitian-definite pencil eigenvalues, spectral gaps, and spectral projectors.
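The sketch below empirically checks the Weyl-type Hermitian perturbation bound that underlies such forward-error guarantees (closely related to the Kahan bound cited above); the matrix size and perturbation magnitude are arbitrary choices.

    import numpy as np

    # Weyl's inequality: for Hermitian A and E, the sorted eigenvalues
    # satisfy |lambda_i(A + E) - lambda_i(A)| <= ||E||_2.
    rng = np.random.default_rng(2)
    n = 200
    A = rng.standard_normal((n, n)); A = (A + A.T) / 2
    E = 1e-8 * rng.standard_normal((n, n)); E = (E + E.T) / 2

    lam = np.linalg.eigvalsh(A)            # ascending order
    lam_pert = np.linalg.eigvalsh(A + E)

    # Holds up to the solver's own (much smaller) rounding errors.
    print(np.max(np.abs(lam_pert - lam)) <= np.linalg.norm(E, 2))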

Table 2: Boolean complexity comparison in finite precision. Here \(\epsilon \in (0,\tfrac{1}{2})\). FP stands for Floating Point and RA for Rational Arithmetic.

Algorithm | Boolean Complexity | Success probability | Model
Tridiagonal Reduction
[21, 20] | \(O(n^3\,\mathcal{F}(\log(\tfrac{n}{\epsilon})))\) | Deterministic | FP
Our results | \(O(n^{\omega}\log(n)\,\mathcal{F}(\log(\tfrac{n}{\epsilon})))\) | Deterministic | FP
Hermitian Eigenvalues
[30] | \(O(n^{3}\,\mathcal{F}(\log(\tfrac{n}{\epsilon})))\) | Deterministic | FP
[3]+[26] | \(O(n^{\omega}\log^2(\tfrac{n}{\epsilon})\,\mathcal{F}(\log^4(\tfrac{n}{\epsilon})\log(n)))\) | \(1-O(1/n)\) | FP
Our results | \(O(n^{\omega}\,\mathcal{F}(\log(\tfrac{n}{\epsilon})) + n^2\,\mathrm{polylog}(\tfrac{n}{\epsilon}))\) | Deterministic | FP+RA

References