The Akhiezer iteration for matrix functions and Sylvester equations
Cade Ballew, Thomas Trogdon, and Heather Wilber
Abstract
We consider the computation of matrix functions \(f(\mathbf A)\) when the eigenvalues of \(\mathbf A\) are known to lie on or near a collection of disjoint intervals \(\Sigma \subset \mathbb {R}\). The Akhiezer iteration is an inverse-free iterative method for this task that arises via an orthogonal polynomial expansion of \(f\) on \(\Sigma \). When \(\Sigma \) consists of two or more intervals, extensions of the Chebyshev polynomials, often called the Akhiezer polynomials, are employed. This method is an extension of the classical Chebyshev iteration and an effective implementation of the ideas of Saad [7].
The Akhiezer iteration relies on orthogonal polynomial recurrence coefficients and Cauchy integrals. Importantly, orthonormal polynomials \(\{p_j\}_{j=0}^\infty \) with respect to a weight function \(w\) satisfy a symmetric three-term recurrence
\(\seteqnumber{0}{}{0}\)\begin{equation} \label {eq:recurr} \begin{aligned} &xp_0(x)=a_0p_0(x)+b_0p_1(x),\\ &xp_j(x)=b_{j-1}p_{j-1}(x)+a_jp_j(x)+b_jp_{j+1}(x),\quad j\geq 1, \end {aligned} \end{equation}
for some recurrence coefficients \(\{a_j\}_{j=0}^\infty ,\{b_j\}_{j=0}^\infty \) where \(b_j>0\) for all \(j\). The Cauchy integrals of these polynomials are defined by
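As a concrete sketch, consider the single-interval special case \(\Sigma=[-1,1]\) with the Chebyshev weight \(w(x)=1/(\pi\sqrt{1-x^2})\) (an illustrative assumption, not the multi-interval Akhiezer setting): then \(p_0=1\), \(p_j=\sqrt{2}\,T_j\) for \(j\geq 1\), and the recurrence coefficients are \(a_j=0\), \(b_0=1/\sqrt{2}\), \(b_j=1/2\) for \(j\geq 1\). The recurrence (1) can be run directly:

```python
import math

# Single-interval special case for illustration: Sigma = [-1, 1] with the
# Chebyshev weight w(x) = 1/(pi*sqrt(1 - x^2)). Then p_0 = 1 and
# p_j = sqrt(2)*T_j for j >= 1, with recurrence coefficients
# a_j = 0 for all j, b_0 = 1/sqrt(2), and b_j = 1/2 for j >= 1.
def a(j):
    return 0.0

def b(j):
    return 1.0 / math.sqrt(2.0) if j == 0 else 0.5

def eval_p(n, x):
    """Evaluate p_0(x), ..., p_n(x) via the three-term recurrence."""
    p = [1.0]                                  # p_0 = 1
    p.append((x - a(0)) * p[0] / b(0))         # first line of the recurrence
    for j in range(1, n):
        # x p_j = b_{j-1} p_{j-1} + a_j p_j + b_j p_{j+1}, solved for p_{j+1}
        p.append(((x - a(j)) * p[j] - b(j - 1) * p[j - 1]) / b(j))
    return p

x = 0.3
vals = eval_p(5, x)
theta = math.acos(x)
# Sanity check against the closed form p_j(cos(theta)) = sqrt(2)*cos(j*theta).
errs = [abs(vals[j] - math.sqrt(2.0) * math.cos(j * theta)) for j in range(1, 6)]
```

The same loop applies verbatim for any \(\Sigma\) once the coefficients \(a_j,b_j\) are supplied, e.g., by the numerical method of [3].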
\(\seteqnumber{0}{}{1}\)\begin{equation*} \mathcal {C}_\Sigma [p_jw](z)=\frac {1}{2\pi \mathrm {i}}\int _\Sigma \frac {p_j(s)w(s)}{s-z}\mathrm {d} s. \end{equation*}
As a particular example, consider \(\Sigma =[a_1,b_1]\cup [a_2,b_2]\) with \(a_1<b_1<a_2<b_2\). The orthonormal polynomials with respect to the weight function
\(\seteqnumber{0}{}{1}\)\begin{equation*} w(x)=\frac {1}{\pi }\boldsymbol {1}_\Sigma (x)\frac {\sqrt {x-b_1}}{\sqrt {b_2-x}\sqrt {x-a_1}\sqrt {x-a_2}}, \end{equation*}
were constructed by Akhiezer in [1]. The construction gives an explicit formula for these polynomials in terms of Jacobi elliptic and theta functions, from which formulae for their recurrence coefficients and Cauchy integrals can be derived [2]. When explicit formulae are not known, e.g., when \(\Sigma \) consists of more than two intervals, \(N\) pairs of recurrence coefficients and Cauchy integrals can be computed in \(\mathrm {O}(N)\) operations via the numerical method of [3].
Given a function \(f\) that is analytic in a region containing \(\Sigma \), let \(p_0,p_1,\ldots \) denote the orthonormal polynomials with respect to \(w\). Then, for \(x\in \Sigma \), a \(p_j\)-series expansion for \(f\) is given by
\(\seteqnumber{0}{}{1}\)\begin{equation*} f(x)=\sum _{j=0}^\infty \alpha _jp_j(x),\quad \alpha _j=\int _\Sigma f(x)p_j(x)w(x)\mathrm {d} x. \end{equation*}
For a matrix \(\mathbf A\) with eigenvalues on or near \(\Sigma \), this extends to an iterative method for computing \(f(\mathbf A)\) by truncating the series:
\(\seteqnumber{0}{}{1}\)\begin{equation} \label {eq:series_trunc} f(\mathbf A)=\sum _{j=0}^\infty \alpha _jp_j(\mathbf A)\approx \sum _{j=0}^k\alpha _jp_j(\mathbf A)=:\mathbf F_{k+1}. \end{equation}
The coefficients \(\{\alpha _j\}_{j=0}^\infty \) and polynomials \(\{p_j(\mathbf A)\}_{j=0}^\infty \) can be generated via Cauchy integrals and recurrence coefficients, respectively. Applying (1), the polynomials are generated as follows:
\(\seteqnumber{0}{}{2}\)\begin{equation*} \begin{aligned} &p_0(\mathbf A)=\mathbf I,\\ &p_1(\mathbf A)=\frac {1}{b_0}(\mathbf A p_0(\mathbf A)-a_0p_0(\mathbf A)),\\ &p_{j}(\mathbf A)=\frac {1}{b_{j-1}}(\mathbf A p_{j-1}(\mathbf A)-a_{j-1}p_{j-1}(\mathbf A)-b_{j-2}p_{j-2}(\mathbf A)),\quad j\geq 2. \end {aligned} \end{equation*}
Let \(\Gamma \) be a counterclockwise oriented curve that encloses the spectrum of \(\mathbf A\) such that \(f\) is analytic in a region containing \(\Gamma \). Then,
\(\seteqnumber{0}{}{2}\)\begin{equation*} \alpha _j=\int _{\Sigma }f(x)p_j(x)w(x)\mathrm {d} x=\int _{\Sigma }\left (\frac {1}{2\pi \mathrm {i}}\int _{\Gamma }\frac {f(z)}{z-x}\mathrm {d} z\right )p_j(x)w(x)\mathrm {d} x. \end{equation*}
Applying a quadrature rule with nodes \(\{z_\ell \}_{\ell =1}^m\) and weights \(\{w_\ell \}_{\ell =1}^m\) to the inner integral, the coefficients can be approximated via Cauchy integrals as
\(\seteqnumber{0}{}{2}\)\begin{equation*} \alpha _j\approx \int _{\Sigma }\frac {1}{2\pi \mathrm {i}}\sum _{\ell =1}^m\frac {w_\ell f(z_\ell )}{z_\ell -x}p_j(x)w(x)\mathrm {d} x=-\sum _{\ell =1}^m w_\ell f(z_\ell )\mathcal {C}_\Sigma [p_jw](z_\ell ). \end{equation*}
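In the single-interval case \(\Sigma=[-1,1]\) with the Chebyshev weight, the Cauchy integrals are known in closed form, so this quadrature can be sketched end to end. The circle radius, node count, and test function \(f=\exp\) below are illustrative choices, not from the source:

```python
import cmath
import math

# Illustrative single-interval case: Sigma = [-1, 1], w(x) = 1/(pi*sqrt(1-x^2)),
# p_0 = 1, p_j = sqrt(2)*T_j. The Cauchy integrals are known in closed form:
#   C_Sigma[p_j w](z) = -c_j * rho(z)^j / (2*pi*i*sqrt(z^2 - 1)),
# where rho(z) = z - sqrt(z^2 - 1), c_0 = 1, c_j = sqrt(2) for j >= 1, and
# sqrt(z^2 - 1) is the branch that behaves like z at infinity.
def cauchy(j, z):
    s = cmath.sqrt(z - 1.0) * cmath.sqrt(z + 1.0)  # branch analytic off [-1,1]
    rho = z - s
    c = 1.0 if j == 0 else math.sqrt(2.0)
    return -c * rho**j / (2.0 * math.pi * 1j * s)

def alpha(j, f, m=200, radius=2.0):
    """alpha_j ~ -sum_l w_l f(z_l) C_Sigma[p_j w](z_l), with the trapezoid
    rule on a circle Gamma of the given radius enclosing [-1, 1]."""
    total = 0.0 + 0.0j
    for l in range(m):
        z = radius * cmath.exp(2j * math.pi * l / m)
        w_l = 2j * math.pi * z / m       # trapezoid quadrature weight on Gamma
        total += w_l * f(z) * cauchy(j, z)
    return -total

coeffs = [alpha(j, cmath.exp) for j in range(6)]
```

For \(f=\exp\) this recovers the classical Chebyshev coefficients \(\alpha_0=I_0(1)\) and \(\alpha_j=\sqrt{2}\,I_j(1)\) for \(j\geq 1\), where \(I_j\) is the modified Bessel function.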
Assuming that one has access to such an approximation, the truncated series (2) can be implemented iteratively as in Algorithm 1. The resulting method has a computable, provable geometric rate of convergence that is independent of the dimension of \(\mathbf A\) and governed by the exterior Green’s function with pole at infinity from classical potential theory. We remark that once the coefficients \(\alpha _j\) are known, the algorithm is the same for every matrix function.
Algorithm 1: Akhiezer iteration for matrix function approximation
Input: \(f\), \(\mathbf {A}\), and functions to compute recurrence coefficients \(a_k,b_k\) and \(p_k\)-series coefficients \(\alpha _k\).
Set \(\mathbf F_0=0\).
for k=0,1,… do
if k=0 then
Set \(\mathbf P_0=\mathbf I\).
else if k=1 then
Set \(\mathbf P_{1}=\frac {1}{b_0}(\mathbf A\mathbf P_0-a_0\mathbf P_0)\).
else
Set \(\mathbf P_{k}=\frac {1}{b_{k-1}}(\mathbf A\mathbf P_{k-1} - a_{k-1}\mathbf P_{k-1} - b_{k-2}\mathbf P_{k-2}).\)
end
Set \(\mathbf F_{k+1}=\mathbf F_k + \alpha _k \mathbf P_k\).
if converged then
return \(\mathbf F_{k+1}\).
end
end
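A minimal run of Algorithm 1 can be sketched in the single-interval Chebyshev case on a diagonal test matrix, so that \(f(\mathbf A)\) is known exactly. The eigenvalues, the iteration count, and the use of Gauss–Chebyshev quadrature for the \(\alpha_k\) are illustrative assumptions, not the authors' setup:

```python
import math

# Minimal run of Algorithm 1 for Sigma = [-1, 1] with the Chebyshev weight:
# recurrence coefficients a_k = 0 (so the a_k terms drop out below),
# b_0 = 1/sqrt(2), b_k = 1/2 for k >= 1. A is diagonal, so every matrix
# operation acts entrywise on the eigenvalues and exp(A) is known exactly.
def b(k):
    return 1.0 / math.sqrt(2.0) if k == 0 else 0.5

def alpha(k, f, K=200):
    # Series coefficient via Gauss-Chebyshev quadrature (illustrative choice;
    # the text instead uses Cauchy integrals of p_k w).
    c = 1.0 if k == 0 else math.sqrt(2.0)
    return sum(f(math.cos((2*i + 1) * math.pi / (2*K))) *
               c * math.cos(k * (2*i + 1) * math.pi / (2*K))
               for i in range(K)) / K

lam = [-0.9, -0.3, 0.2, 0.8]        # eigenvalues of the diagonal test matrix
F = [0.0] * len(lam)                # F_0 = 0
P_prev, P = None, None
for k in range(30):
    if k == 0:
        P = [1.0] * len(lam)        # P_0 = I
    elif k == 1:
        P_prev, P = P, [x * p / b(0) for x, p in zip(lam, P)]
    else:
        P_prev, P = P, [(x * p - b(k - 2) * q) / b(k - 1)
                        for x, p, q in zip(lam, P, P_prev)]
    F = [fk + alpha(k, math.exp) * pk for fk, pk in zip(F, P)]

err = max(abs(fk - math.exp(x)) for fk, x in zip(F, lam))
```

The error decays geometrically in the iteration count, consistent with the convergence rate discussed above, and the loop itself never inverts or factors the matrix.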
A particular application pertains to the solution of Sylvester equations of the form
\(\seteqnumber{0}{}{2}\)\begin{equation} \label {eq:sylv_eqn} \mathbf X\mathbf A-\mathbf B\mathbf X=\mathbf C, \end{equation}
where the spectra of \(\mathbf A\in \mathbb {C}^{n\times n}\) and \(\mathbf B\in \mathbb {C}^{m\times m}\) lie in known intervals. If these intervals are disjoint, the unique solution \(\mathbf X\) to (3) is one half of the lower left block of the matrix
\(\seteqnumber{0}{}{3}\)\begin{equation} \label {eq:mat_sign} \mathrm {sign}\begin{pmatrix}\mathbf A&\mathbf 0 \\ \mathbf C&\mathbf B\end {pmatrix}, \end{equation}
where \(\mathrm {sign}\) evaluates to \(1\) on the spectrum of \(\mathbf A\) and \(-1\) on the spectrum of \(\mathbf B\) [6].
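A scalar sanity check of this formulation, with hypothetical numbers \(a>0>b\): a block-triangular similarity shows that, with \(\mathrm{sign}=+1\) on \(\{a\}\) and \(-1\) on \(\{b\}\), the lower left entry of the sign matrix equals \(2x\), where \(x=c/(a-b)\) solves the scalar version of (3). Here the sign is computed independently by the Newton iteration \(\mathbf S\leftarrow(\mathbf S+\mathbf S^{-1})/2\) for the matrix sign function:

```python
# Scalar sanity check with hypothetical numbers a, b, c. With sign = +1 on
# spec(A) = {a} and -1 on spec(B) = {b}, a block-triangular similarity gives
#   sign([[a, 0], [c, b]]) = [[1, 0], [2*x, -1]],  x = c/(a - b),
# where x solves x*a - b*x = c. The sign function is computed by the
# standard Newton iteration S <- (S + S^{-1})/2.
def inv2(M):
    (p, q), (r, s) = M
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

def sign2(M, iters=60):
    S = [row[:] for row in M]
    for _ in range(iters):
        T = inv2(S)
        S = [[(S[i][j] + T[i][j]) / 2.0 for j in range(2)] for i in range(2)]
    return S

a, b, c = 2.5, -1.2, 0.7            # spec(A) = {2.5}, spec(B) = {-1.2}
S = sign2([[a, 0.0], [c, b]])
x = c / (a - b)                     # solves x*a - b*x = c
```

The factor of two in the lower left block is a consequence of the chosen sign normalization; the Akhiezer iteration applies because the sign function is constant, hence analytic, on a neighborhood of each spectral interval.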
Algorithm 1 can be applied directly to compute (4); however, its naive use requires computing potentially dense matrix-matrix products, as well as blocks that are irrelevant to the approximate solution. When \(\mathbf C=\mathbf U\mathbf V\) is low rank, this can be circumvented by deriving an equivalent iteration for only the relevant block: the updates are written in block form and the low-rank factors are recompressed at each iteration.
Such an implementation is effectively \(\mathrm {O}(n^2)\) for \(\mathbf A,\mathbf B\in \mathbb {C}^{n\times n}\), as it requires only matrix-vector products and the compression of low-rank objects. In contrast, when the coefficient matrices are dense, rational methods and direct solvers typically require \(\mathrm {O}(n^3)\) operations. We compare timings of such an implementation with the Bartels–Stewart algorithm [4] and factored alternating direction implicit (ADI) iterations [5] in Table 1. The lower computational complexity is reflected in these timings: the Akhiezer iteration outperforms the competing methods for \(n\geq 500\).
Runtime (seconds)

\(n\)  | Akhiezer | Factored ADI | Bartels–Stewart
100    | 0.0639   | 0.0116       | 0.0060
500    | 0.2263   | 0.3939       | 0.2836
1000   | 0.4947   | 1.9799       | 1.8147
1500   | 0.8297   | 4.8730       | 6.5464
2000   | 1.3224   | 9.6079       | 21.3945
Table 1: Runtime for solving (3) to full precision where \(\mathbf A\in \mathbb {R}^{n\times n}\) has spectrum contained in \([2,3]\), \(\mathbf B\in \mathbb {R}^{n\times n}\) has spectrum contained in \([-1.8,-0.5]\), and \(\mathbf C\) is rank \(2\).
References
[1] N. I. Akhiezer. Elements of the Theory of Elliptic Functions, volume 79 of Translations of Mathematical Monographs. American Mathematical Society, 1970.

[2] C. Ballew and T. Trogdon. The Akhiezer iteration. arXiv preprint arXiv:2312.02384, 2023.

[3] C. Ballew and T. Trogdon. A Riemann–Hilbert approach to computing the inverse spectral map for measures supported on disjoint intervals. Studies in Applied Mathematics, 152(1):31–72, 2024.

[4] R. H. Bartels and G. W. Stewart. Algorithm 432 [C2]: Solution of the matrix equation AX + XB = C [F4]. Communications of the ACM, 15(9):820–826, 1972.

[5] P. Benner, R.-C. Li, and N. Truhar. On the ADI method for Sylvester equations. Journal of Computational and Applied Mathematics, 233(4):1035–1045, 2009.

[6] N. J. Higham. Functions of Matrices: Theory and Computation. SIAM, 2008.

[7] Y. Saad. Iterative solution of indefinite symmetric linear systems by methods using orthogonal polynomials over two disjoint intervals. SIAM Journal on Numerical Analysis, 20(4):784–811, 1983.