
The Akhiezer iteration for matrix functions and Sylvester equations

Cade Ballew, Thomas Trogdon, and Heather Wilber

Abstract

We consider the computation of matrix functions \(f(\mathbf A)\) when the eigenvalues of \(\mathbf A\) are known to lie on or near a collection of disjoint intervals \(\Sigma \subset \mathbb {R}\). The Akhiezer iteration is an inverse-free iterative method for this task that arises via an orthogonal polynomial expansion of \(f\) on \(\Sigma \). When \(\Sigma \) consists of two or more intervals, extensions of the Chebyshev polynomials, often called the Akhiezer polynomials, are employed. This method is an extension of the classical Chebyshev iteration and an effective implementation of the ideas of Saad [7].

The Akhiezer iteration relies on orthogonal polynomial recurrence coefficients and Cauchy integrals. Importantly, orthonormal polynomials \(\{p_j\}_{j=0}^\infty \) with respect to a weight function \(w\) satisfy a symmetric three-term recurrence

\begin{equation} \label {eq:recurr} \begin{aligned} &xp_0(x)=a_0p_0(x)+b_0p_1(x),\\ &xp_j(x)=b_{j-1}p_{j-1}(x)+a_jp_j(x)+b_jp_{j+1}(x),\quad j\geq 1, \end {aligned} \end{equation}

for some recurrence coefficients \(\{a_j\}_{j=0}^\infty ,\{b_j\}_{j=0}^\infty \) where \(b_j>0\) for all \(j\). The Cauchy integrals of these polynomials are defined by
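For the single interval \(\Sigma =[-1,1]\) with the Chebyshev weight \(w(x)=\frac {1}{\pi \sqrt {1-x^2}}\), the orthonormal polynomials and their recurrence coefficients are known in closed form: \(p_0=T_0\), \(p_j=\sqrt {2}\,T_j\) for \(j\geq 1\), with \(a_j=0\), \(b_0=1/\sqrt {2}\), and \(b_j=1/2\) for \(j\geq 1\). A minimal numerical check of (1) in this special case, assuming NumPy:

```python
import numpy as np

# Orthonormal polynomials for the Chebyshev weight on [-1, 1]:
# p_0 = T_0 and p_j = sqrt(2)*T_j for j >= 1.
def p(j, x):
    scale = 1.0 if j == 0 else np.sqrt(2.0)
    return scale * np.cos(j * np.arccos(x))

# Closed-form recurrence coefficients for this weight.
a = lambda j: 0.0
b = lambda j: 1.0 / np.sqrt(2.0) if j == 0 else 0.5

x = np.linspace(-0.99, 0.99, 7)

# Check x*p_0 = a_0*p_0 + b_0*p_1 and the j >= 1 relation in (1).
assert np.allclose(x * p(0, x), a(0) * p(0, x) + b(0) * p(1, x))
for j in range(1, 6):
    lhs = x * p(j, x)
    rhs = b(j - 1) * p(j - 1, x) + a(j) * p(j, x) + b(j) * p(j + 1, x)
    assert np.allclose(lhs, rhs)
```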

\begin{equation*} \mathcal {C}_\Sigma [p_jw](z)=\frac {1}{2\pi \mathrm {i}}\int _\Sigma \frac {p_j(s)w(s)}{s-z}\mathrm {d} s. \end{equation*}

As a particular example, consider \(\Sigma =[a_1,b_1]\cup [a_2,b_2]\), \(b_1<a_2\). The orthonormal polynomials with respect to the weight function

\begin{equation*} w(x)=\frac {1}{\pi }\boldsymbol {1}_\Sigma (x)\frac {\sqrt {x-b_1}}{\sqrt {b_2-x}\sqrt {x-a_1}\sqrt {x-a_2}}, \end{equation*}

were constructed by Akhiezer in [1]. The construction gives an explicit formula for these polynomials in terms of Jacobi elliptic and theta functions, from which formulae for their recurrence coefficients and Cauchy integrals can be derived [2]. When explicit formulae are not known, e.g., when \(\Sigma \) consists of more than two intervals, \(N\) pairs of recurrence coefficients and Cauchy integrals can be computed in \(\mathrm {O}(N)\) operations via the numerical method of [3].

Given a function \(f\) that is analytic in a region containing \(\Sigma \), let \(p_0,p_1,\ldots \) denote the orthonormal polynomials with respect to \(w\). Then, for \(x\in \Sigma \), a \(p_j\)-series expansion for \(f\) is given by

\begin{equation*} f(x)=\sum _{j=0}^\infty \alpha _jp_j(x),\quad \alpha _j=\int _\Sigma f(x)p_j(x)w(x)\mathrm {d} x. \end{equation*}

For a matrix \(\mathbf A\) with eigenvalues on or near \(\Sigma \), this extends to an iterative method for computing \(f(\mathbf A)\) by truncating the series:

\begin{equation} \label {eq:series_trunc} f(\mathbf A)=\sum _{j=0}^\infty \alpha _jp_j(\mathbf A)\approx \sum _{j=0}^k\alpha _jp_j(\mathbf A)=:\mathbf F_{k+1}. \end{equation}

The coefficients \(\{\alpha _j\}_{j=0}^\infty \) and polynomials \(\{p_j(\mathbf A)\}_{j=0}^\infty \) can be generated via Cauchy integrals and recurrence coefficients, respectively. Applying (1), the polynomials are generated as follows:

\begin{equation*} \begin{aligned} &p_0(\mathbf A)=\mathbf I,\\ &p_1(\mathbf A)=\frac {1}{b_0}(\mathbf A p_0(\mathbf A)-a_0p_0(\mathbf A)),\\ &p_{j}(\mathbf A)=\frac {1}{b_{j-1}}(\mathbf A p_{j-1}(\mathbf A)-a_{j-1}p_{j-1}(\mathbf A)-b_{j-2}p_{j-2}(\mathbf A)),\quad j\geq 2. \end {aligned} \end{equation*}
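This matrix recurrence is straightforward to implement. The sketch below uses the closed-form Chebyshev coefficients for \(\Sigma =[-1,1]\) as a stand-in for general Akhiezer data; the function names are illustrative, not from the authors' code.

```python
import numpy as np

def matrix_poly_sequence(A, a, b, k):
    """Return [p_0(A), ..., p_k(A)] via the three-term recurrence."""
    n = A.shape[0]
    P = [np.eye(n)]
    if k >= 1:
        P.append((A @ P[0] - a(0) * P[0]) / b(0))
    for j in range(2, k + 1):
        P.append((A @ P[j - 1] - a(j - 1) * P[j - 1]
                  - b(j - 2) * P[j - 2]) / b(j - 1))
    return P

# Chebyshev stand-in: a_j = 0, b_0 = 1/sqrt(2), b_j = 1/2 for j >= 1,
# so that p_j(A) = sqrt(2)*T_j(A) for j >= 1.
a = lambda j: 0.0
b = lambda j: 1.0 / np.sqrt(2.0) if j == 0 else 0.5

A = np.diag([-0.9, -0.2, 0.4, 0.8])  # spectrum inside [-1, 1]
P = matrix_poly_sequence(A, a, b, 3)
# Sanity check against sqrt(2)*T_2(A) = sqrt(2)*(2A^2 - I):
assert np.allclose(P[2], np.sqrt(2.0) * (2.0 * A @ A - np.eye(4)))
```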

Let \(\Gamma \) be a counterclockwise oriented curve that encloses the spectrum of \(\mathbf A\) such that \(f\) is analytic in a region containing \(\Gamma \). Then,

\begin{equation*} \alpha _j=\int _{\Sigma }f(x)p_j(x)w(x)\mathrm {d} x=\int _{\Sigma }\left (\frac {1}{2\pi \mathrm {i}}\int _{\Gamma }\frac {f(z)}{z-x}\mathrm {d} z\right )p_j(x)w(x)\mathrm {d} x. \end{equation*}

Applying a quadrature rule with nodes \(\{z_\ell \}_{\ell =1}^m\) and weights \(\{w_\ell \}_{\ell =1}^m\) to the inner integral, the coefficients can be approximated via Cauchy integrals as

\begin{equation*} \alpha _j\approx \int _{\Sigma }\frac {1}{2\pi \mathrm {i}}\sum _{\ell =1}^m\frac {w_\ell f(z_\ell )}{z_\ell -x}p_j(x)w(x)\mathrm {d} x=-\sum _{\ell =1}^m w_\ell f(z_\ell )\mathcal {C}_\Sigma [p_jw](z_\ell ). \end{equation*}
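As a concrete illustration, for \(\Sigma =[-1,1]\) with the Chebyshev weight the Cauchy integrals are explicit: with \(\rho (z)=z-\sqrt {z-1}\sqrt {z+1}\) (the branch with \(|\rho (z)|<1\) off \(\Sigma \)) and \(c_0=1\), \(c_j=\sqrt 2\) for \(j\geq 1\), one has \(\mathcal {C}_\Sigma [p_jw](z)=-c_j\rho (z)^j/(2\pi \mathrm {i}\sqrt {z-1}\sqrt {z+1})\). The sketch below, a hedged stand-in for the general machinery, takes \(\Gamma \) to be a circle of radius 2 discretized by the trapezoidal rule and checks the resulting coefficients against direct quadrature of \(\int _\Sigma f\,p_j\,w\,\mathrm {d}x\); all function names are illustrative.

```python
import numpy as np

def cauchy_cheb(j, z):
    """Explicit Cauchy integral C[p_j w](z) for the Chebyshev weight."""
    c = 1.0 if j == 0 else np.sqrt(2.0)
    root = np.sqrt(z - 1.0) * np.sqrt(z + 1.0)  # branch analytic off [-1,1]
    return -c * (z - root) ** j / (2.0j * np.pi * root)

def series_coeffs(f, num_coeffs, m=64, radius=2.0):
    """alpha_j ~= -sum_l w_l f(z_l) C[p_j w](z_l), trapezoid rule on Gamma."""
    theta = 2.0 * np.pi * np.arange(m) / m
    z = radius * np.exp(1j * theta)  # quadrature nodes on the circle Gamma
    w = 2.0j * np.pi * z / m         # trapezoidal quadrature weights for dz
    return np.array([(-w * f(z) * cauchy_cheb(j, z)).sum().real
                     for j in range(num_coeffs)])

alpha = series_coeffs(np.exp, 5)

# Check against direct Gauss-Chebyshev quadrature of int f p_j w dx.
M = 400
t = (np.arange(M) + 0.5) * np.pi / M
xs = np.cos(t)
for j in range(5):
    c = 1.0 if j == 0 else np.sqrt(2.0)
    direct = c * np.mean(np.exp(xs) * np.cos(j * t))
    assert abs(alpha[j] - direct) < 1e-10
```

For \(f=\exp \), the computed \(\alpha _0\) matches the known value \(I_0(1)\approx 1.26607\), where \(I_j\) is the modified Bessel function.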

Assuming that one has access to such an approximation, the truncated series (2) can be implemented as an iteration as in Algorithm 1. The resulting method has a computable and provable geometric rate of convergence that is independent of the dimension of \(\mathbf A\) and governed by the classical exterior Green’s function with pole at infinity from potential theory. We remark that once the coefficients \(\alpha _j\) are known, this algorithm is the same for all matrix functions.

Algorithm 1: Akhiezer iteration for matrix function approximation

Input: \(f\), \(\mathbf {A}\), and functions to compute recurrence coefficients \(a_k,b_k\) and \(p_k\)-series coefficients \(\alpha _k\).

Set \(\mathbf F_0=0\).

for k=0,1,… do

if k=0 then

Set \(\mathbf P_0=\mathbf I\).

else if k=1 then

Set \(\mathbf P_{1}=\frac {1}{b_0}(\mathbf A\mathbf P_0-a_0\mathbf P_0)\).

else

Set \(\mathbf P_{k}=\frac {1}{b_{k-1}}(\mathbf A\mathbf P_{k-1} - a_{k-1}\mathbf P_{k-1} - b_{k-2}\mathbf P_{k-2}).\)

end

Set \(\mathbf F_{k+1}=\mathbf F_k + \alpha _k \mathbf P_k\).

if converged then

return \(\mathbf F_{k+1}\).

end

end
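The loop above can be sketched in a few lines. The sketch below specializes to the single-interval case \(\Sigma =[-1,1]\), where the Akhiezer iteration reduces to the classical Chebyshev iteration; for simplicity the coefficients \(\alpha _j\) are computed by Gauss–Chebyshev quadrature rather than via Cauchy integrals, and all names are illustrative rather than taken from the authors' implementation.

```python
import numpy as np

def akhiezer_iteration(f, A, max_iter=60, tol=1e-14):
    """Algorithm 1 for Sigma = [-1, 1] (classical Chebyshev iteration)."""
    n = A.shape[0]
    # Closed-form recurrence coefficients for the Chebyshev weight.
    a = lambda j: 0.0
    b = lambda j: 1.0 / np.sqrt(2.0) if j == 0 else 0.5
    # alpha_j = int f(x) p_j(x) w(x) dx by Gauss-Chebyshev quadrature.
    M = 200
    t = (np.arange(M) + 0.5) * np.pi / M
    fx = f(np.cos(t))
    def alpha(j):
        c = 1.0 if j == 0 else np.sqrt(2.0)
        return c * np.mean(fx * np.cos(j * t))
    F = np.zeros_like(A)
    P_prev, P = None, None
    for k in range(max_iter):
        if k == 0:
            P = np.eye(n)
        elif k == 1:
            P_prev, P = P, (A @ P - a(0) * P) / b(0)
        else:
            P_prev, P = P, (A @ P - a(k - 1) * P - b(k - 2) * P_prev) / b(k - 1)
        update = alpha(k) * P
        F = F + update
        if np.linalg.norm(update) < tol * np.linalg.norm(F):  # converged
            break
    return F

# Symmetric test matrix with spectrum in [-1, 1]; compare the iterate
# against exp(A) computed by eigendecomposition.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((20, 20)))
A = Q @ np.diag(np.linspace(-0.9, 0.9, 20)) @ Q.T
F = akhiezer_iteration(np.exp, A)
lam, V = np.linalg.eigh(A)
assert np.linalg.norm(F - V @ np.diag(np.exp(lam)) @ V.T) < 1e-10
```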

A particular application pertains to the solution of Sylvester equations of the form

\begin{equation} \label {eq:sylv_eqn} \mathbf X\mathbf A-\mathbf B\mathbf X=\mathbf C, \end{equation}

where the spectra of \(\mathbf A\in \mathbb {C}^{n\times n}\) and \(\mathbf B\in \mathbb {C}^{m\times m}\) lie in known intervals. If these intervals are disjoint, the unique solution \(\mathbf X\) to (3) is one half of the lower left block of the matrix

\begin{equation} \label {eq:mat_sign} \mathrm {sign}\begin{pmatrix}\mathbf A&\mathbf 0 \\ \mathbf C&\mathbf B\end {pmatrix}, \end{equation}

where \(\mathrm {sign}\) evaluates to \(1\) on the spectrum of \(\mathbf A\) and \(-1\) on the spectrum of \(\mathbf B\) [6].
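This relation can be verified directly: block-diagonalizing \(\mathbf M=\begin {pmatrix}\mathbf A&\mathbf 0\\\mathbf C&\mathbf B\end {pmatrix}\) by the similarity \(\begin {pmatrix}\mathbf I&\mathbf 0\\\mathbf X&\mathbf I\end {pmatrix}\) gives \(\mathrm {sign}(\mathbf M)=\begin {pmatrix}\mathbf I&\mathbf 0\\2\mathbf X&-\mathbf I\end {pmatrix}\), so the solution is half the lower left block. A small numerical sketch, computing \(\mathrm {sign}(\mathbf M)\) by eigendecomposition purely for illustration (not the inverse-free approach of the text):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 3
# A with spectrum near [2, 3], B with spectrum near [-1.8, -0.5]:
A = np.diag(np.linspace(2.0, 3.0, n)) + 0.05 * rng.standard_normal((n, n))
B = np.diag(np.linspace(-1.8, -0.5, m)) + 0.05 * rng.standard_normal((m, m))
C = rng.standard_normal((m, n))

# Form M = [[A, 0], [C, B]] and evaluate sign(M) via eigendecomposition,
# with sign(+1) on spec(A) and sign(-1) on spec(B).
M = np.block([[A, np.zeros((n, m))], [C, B]])
lam, V = np.linalg.eig(M)
S = (V @ np.diag(np.sign(lam.real)) @ np.linalg.inv(V)).real

X = 0.5 * S[n:, :n]  # half the lower left block
assert np.linalg.norm(X @ A - B @ X - C) < 1e-8  # X solves X A - B X = C
```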

Algorithm 1 can be directly applied to compute (4); however, its naive use requires the computation of potentially dense matrix-matrix products and blocks that are irrelevant to the approximate solution. In the case where \(\mathbf C=\mathbf U\mathbf V\) has low rank, this can be circumvented by deriving an equivalent iteration for only the relevant block: updates are written in low-rank form and recompressed at each iteration.

Such an implementation is effectively \(\mathrm {O}(n^2)\) for \(\mathbf A,\mathbf B\in \mathbb {C}^{n\times n}\), as it requires only matrix-vector products and the compression of low-rank objects. In contrast, when the coefficient matrices are dense, rational methods and direct solvers will typically be \(\mathrm {O}(n^3)\). We compare timings of such an implementation with the Bartels–Stewart algorithm [4] and the factored alternating direction implicit (ADI) iteration [5] in Table 1. The lower computational complexity is reflected in these timings: the Akhiezer iteration has a shorter runtime than the competing methods for all \(n\geq 500\).

Runtime (seconds)

  \(n\)     Akhiezer    Factored ADI    Bartels–Stewart
  100      0.0639      0.0116          0.0060
  500      0.2263      0.3939          0.2836
  1000     0.4947      1.9799          1.8147
  1500     0.8297      4.8730          6.5464
  2000     1.3224      9.6079          21.3945

Table 1: Runtime for solving (3) to full precision where \(\mathbf A\in \mathbb {R}^{n\times n}\) has spectrum contained in \([2,3]\), \(\mathbf B\in \mathbb {R}^{n\times n}\) has spectrum contained in \([-1.8,-0.5]\), and \(\mathbf C\) is rank \(2\).

References