\(\newcommand {\pmat }[1]{\begin {pmatrix}#1\end {pmatrix}}\) \(\newcommand {\bmat }[1]{\begin {bmatrix}#1\end {bmatrix}}\) \(\newcommand {\norm }[1]{\|#1\|}\) \(\newcommand {\T }{^T\!}\) \(\newcommand {\xstar }{x^{\star }}\) \(\newcommand {\st }{\text {s.t.}}\) \(\newcommand {\R }{\mathbb {R}}\)

MinAres: An Iterative Solver for Symmetric Linear Systems

Alexis Montoison, Dominique Orban, Michael Saunders

Abstract

1 MinAres

Suppose \(A \in \R ^{n \times n}\) is a large symmetric matrix for which matrix-vector products \(Av\) can be computed efficiently for any vector \(v \in \R ^n\). We present a Krylov subspace method called MinAres for computing a solution to the following problems:

\begin{alignat} {2} \label {eq:Ax=b} & \text {Symmetric linear systems:} && \quad Ax = b, \\ \label {eq:ls} & \text {Symmetric least-squares problems:} && \quad \min \norm {Ax - b}, \\ \label {eq:nullvector} & \text {Symmetric nullspace problems:} && \quad Ar = 0, \\ \label {eq:eigenvector} & \text {Symmetric eigenvalue problems:} && \quad Ar = \lambda r, \\[-4pt] \label {eq:singularvector} & \text {Singular value problems for rectangular $B$:} && \quad \bmat {& B \\ B\T & } \bmat {u \\ v} = \sigma \bmat {u \\ v}. \end{alignat} If \(A\) is nonsingular, problems (1)–(2) have a unique solution \(\xstar \). When \(A\) is singular, if \(b\) is not in the range of \(A\) then (1) has no solution; otherwise, (1)–(2) have infinitely many solutions, and we seek the unique \(\xstar \) that solves the problem

\begin{equation} \label {eq:AAx=Ab} \min \tfrac {1}{2} \|x\|^2 \quad \st \quad A^2 x = Ab. \end{equation}
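For intuition, problem (6) characterizes the minimum-norm least-squares solution, which the Moore-Penrose pseudoinverse delivers directly. The following NumPy sketch (illustrative only; the singular matrix and right-hand side are arbitrary choices, not from the paper) checks that \(\xstar = A^\dagger b\) satisfies \(A^2 x = Ab\) even when \(Ax = b\) itself has no solution:

```python
import numpy as np

# Singular symmetric matrix (rank 2, dimension 3) -- an illustrative choice.
A = np.array([[2.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0]])
b = np.array([2.0, 3.0, 5.0])  # b has a component outside range(A): Ax = b is inconsistent

# Minimum-norm least-squares solution via the pseudoinverse.
x_star = np.linalg.pinv(A) @ b

# x_star satisfies A^2 x = A b (problem (6)) although Ax = b has no solution.
assert np.allclose(A @ A @ x_star, A @ b)
# Among all solutions of A^2 x = A b, x_star has minimal norm:
# its component in the nullspace of A is zero.
assert abs(x_star[2]) < 1e-12
```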

Let \(x_k\) be an approximation to \(\xstar \) with residual \(r_k = b - Ax_k\). If \(A\) were unsymmetric or rectangular, applicable solvers for (1)–(2) would be Lsqr [10] and Lsmr [3], which reduce \(\norm {r_k}\) and \(\norm {A\T r_k}\) respectively within the \(k\)th Krylov subspace \(\mathcal {K}_k(A\T A, A\T b)\) generated by the Golub-Kahan bidiagonalization on \((A,b)\) [4].

For (1)–(5), our algorithm MinAres solves (6) by reducing \(\norm {Ar_k}\) within the \(k\)th Krylov subspace \(\mathcal {K}_k(A,b)\) generated by the symmetric Lanczos process on \((A,b)\) [6]. Thus when \(A\) is symmetric, MinAres minimizes the same quantity \(\norm {Ar_k}\) as Lsmr, but in different (more effective) subspaces, and it requires only one matrix-vector product \(Av\) per iteration, whereas Lsmr would need two.

Qualitatively, certain residual norms decrease smoothly for these iterative methods, while others behave erratically as they approach zero. Stopping criteria should ideally involve the smooth quantities. For Lsqr and Lsmr on general (possibly rectangular) systems, \(\norm {r_k}\) decreases smoothly for both methods; Lsqr is always ahead by construction, but never by much, so on consistent systems \(Ax=b\) Lsqr may terminate only slightly sooner than Lsmr. On inconsistent systems \(Ax \approx b\), the comparison is more striking: \(\norm {A\T r_k}\) decreases erratically for Lsqr but smoothly for Lsmr, and there is usually a significant difference between the two. Thus Lsmr may terminate significantly sooner [3].

Similarly for Minres [9] and MinAres, \(\norm {r_k}\) decreases smoothly for both methods, and on consistent symmetric systems \(Ax=b\), Minres may have a small advantage. On inconsistent symmetric systems \(Ax \approx b\), \(\norm {Ar_k}\) decreases erratically for Minres and its variant Minres-qlp [2] but smoothly for MinAres, and there is usually a significant difference between them. Thus MinAres may terminate sooner.

MinAres completes the family of Krylov methods based on the symmetric Lanczos process. As it minimizes \(\norm {A r_k}\) (which always converges to zero), MinAres can be applied safely to any symmetric system.

On consistent symmetric systems, MinAres is a viable alternative to Minres and Minres-qlp: it converges in a similar number of iterations when the stopping condition is based on \(\norm {r_k}\), and much sooner when the stopping condition is based on \(\norm {A r_k}\). On singular inconsistent symmetric systems, MinAres outperforms Minres-qlp and Lsmr, and should be the preferred method. Furthermore, a lifting step [7] can be applied to move from the final iterate to the minimum-norm (pseudoinverse) solution at negligible cost.
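Problem (5) above underlies the singular-value use case: the nonzero eigenvalues of the symmetric augmented matrix are \(\pm \sigma _i(B)\). A small NumPy check (illustrative; the random rectangular \(B\) and seed are arbitrary choices) makes this concrete:

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 3))  # rectangular B; sizes are an arbitrary choice
m, n = B.shape

# Symmetric augmented matrix [0 B; B^T 0] from problem (5).
K = np.block([[np.zeros((m, m)), B],
              [B.T, np.zeros((n, n))]])

sigma = np.linalg.svd(B, compute_uv=False)           # singular values, descending
eigs = np.sort(np.abs(np.linalg.eigvalsh(K)))[::-1]  # |eigenvalues| of K, descending

# Each singular value sigma_i appears twice among |eigenvalues| (as +sigma_i and -sigma_i),
# so taking every other entry recovers the singular values of B.
assert np.allclose(eigs[:2 * len(sigma):2], sigma)
```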

2 CAr

We introduce CAr, a new conjugate direction method similar to Cg and Cr (the conjugate gradient and conjugate residual methods of Hestenes and Stiefel [5, 11] for solving symmetric positive definite (SPD) systems \(Ax=b\)). Each of these methods generates a sequence of approximate solutions \(x_k\) in the Krylov subspaces \(\mathcal {K}_k(A, b)\) by minimizing a quadratic function \(f(x)\):

\begin{alignat*} {3} &f_{\text {Cg}}(x) = \tfrac {1}{2} x\T A x - b\T x, \quad \quad &f_{\text {Cr}}(x) = \tfrac {1}{2} \norm {Ax - b}^2, \quad \quad &f_{\text {CAr}}(x) = \tfrac {1}{2} \norm {A^{2\!} x - Ab}^2. \end{alignat*} CAr is to MinAres as Cr is to Minres. For SPD \(A\), CAr is mathematically equivalent to MinAres, and both methods exhibit monotonic decrease in \(\norm {Ar_k}\), \(\norm {r_k}\), \(\norm {x_k - \xstar }\), and \(\norm {x_k - \xstar }_A\). The name CAr reflects its property of generating successive \(A\)-residuals that are conjugate with respect to \(A\). Designed to minimize \(\norm {Ar_k}\) in \(\mathcal {K}_k(A, b)\), CAr complements the family of conjugate direction methods Cg and Cr for SPD systems.

Algorithm 1: Cg

  • Require: \(A\), \(b\), \(\epsilon > 0\)

  • \(k=0\), \(x_0 = 0\)

  • \(r_0 = b\), \(p_0 = r_0\)

  • \(q_0 = Ap_0\)

  • \(\rho _0 = r_0\T r_0\)

  • while \(\norm {r_k} > \epsilon \) do

  • \(\alpha _k = \rho _k / p_k\T q_k\)

  • \(x_{k+1} = x_{k} + \alpha _k p_k\)

  • \(r_{k+1} = r_{k} - \alpha _k q_k\)

  • \(\rho _{k+1} = r_{k+1}\T r_{k+1}\)

  • \(\beta _k = \rho _{k+1} / \rho _{k}\)

  • \(p_{k+1} = r_{k+1} + \beta _{k} p_k\)

  • \(q_{k+1} = A p_{k+1}\)

  • \(k \leftarrow k + 1\)

  • end while
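Algorithm 1 translates almost line for line into NumPy. The sketch below is illustrative only (the 2×2 SPD test matrix and tolerance are arbitrary choices), and follows the recurrences above exactly, with one matrix-vector product per iteration:

```python
import numpy as np

def cg(A, b, eps=1e-10, maxiter=100):
    """Conjugate gradient (Algorithm 1) for SPD A, started from x_0 = 0."""
    x = np.zeros_like(b)
    r = b.copy()          # r_0 = b
    p = r.copy()          # p_0 = r_0
    q = A @ p             # q_0 = A p_0
    rho = r @ r           # rho_0 = r_0^T r_0
    for _ in range(maxiter):
        if np.sqrt(rho) <= eps:        # ||r_k|| <= eps
            break
        alpha = rho / (p @ q)
        x = x + alpha * p
        r = r - alpha * q
        rho_next = r @ r
        beta = rho_next / rho
        p = r + beta * p
        q = A @ p                      # the single matrix-vector product
        rho = rho_next
    return x

# Arbitrary 2x2 SPD test system.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = cg(A, b)
assert np.allclose(A @ x, b)
```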

   

Algorithm 2: Cr

  • Require: \(A\), \(b\), \(\epsilon > 0\)

  • \(k=0\), \(x_0 = 0\)

  • \(r_0 = b\), \(p_0 = r_0\)

  • \(s_0 = A r_0\), \(q_0 = s_0\)

  • \(\rho _0 = r_0\T s_0\)

  • while \(\norm {r_k} > \epsilon \) do

  • \(\alpha _k = \rho _k / \norm {q_k}^2\)

  • \(x_{k+1} = x_{k} + \alpha _k p_k\)

  • \(r_{k+1} = r_{k} - \alpha _k q_k\)

  • \(s_{k+1} = A r_{k+1}\)

  • \(\rho _{k+1} = r_{k+1}\T s_{k+1}\)

  • \(\beta _k = \rho _{k+1} / \rho _{k}\)

  • \(p_{k+1} = r_{k+1} + \beta _{k} p_k\)

  • \(q_{k+1} = s_{k+1} + \beta _{k} q_k\)

  • \(k \leftarrow k + 1\)

  • end while
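A NumPy sketch of Algorithm 2 (again illustrative, with an arbitrary SPD test system) highlights how the recurrence \(q_{k+1} = s_{k+1} + \beta _k q_k\) keeps \(q_k = A p_k\) up to date with only the single product \(s_{k+1} = A r_{k+1}\) per iteration:

```python
import numpy as np

def cr(A, b, eps=1e-10, maxiter=100):
    """Conjugate residual (Algorithm 2) for SPD A, started from x_0 = 0."""
    x = np.zeros_like(b)
    r = b.copy()          # r_0 = b
    p = r.copy()          # p_0 = r_0
    s = A @ r             # s_0 = A r_0
    q = s.copy()          # q_0 = s_0
    rho = r @ s           # rho_0 = r_0^T s_0
    for _ in range(maxiter):
        if np.linalg.norm(r) <= eps:
            break
        alpha = rho / (q @ q)          # alpha_k = rho_k / ||q_k||^2
        x = x + alpha * p
        r = r - alpha * q
        s = A @ r                      # the single matrix-vector product
        rho_next = r @ s
        beta = rho_next / rho
        p = r + beta * p
        q = s + beta * q               # maintains q_k = A p_k without forming A p_k
        rho = rho_next
    return x

# Arbitrary 2x2 SPD test system.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = cr(A, b)
assert np.allclose(A @ x, b)
```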

   

Algorithm 3: CAr

  • Require: \(A\), \(b\), \(\epsilon > 0\)

  • \(k=0\), \(x_0 = 0\)

  • \(r_0 = b\), \(p_0 = r_0\)

  • \(s_0 = A r_0\), \(q_0 = s_0\)

  • \(t_0 = A s_0\), \(u_0 = t_0\)

  • \(\rho _0 = s_0\T t_0\)

  • while \(\norm {r_k} > \epsilon \) do

  • \(\alpha _k = \rho _k / \norm {u_k}^2\)

  • \(x_{k+1} = x_{k} + \alpha _k p_k\)

  • \(r_{k+1} = r_{k} - \alpha _k q_k\)

  • \(s_{k+1} = s_{k} - \alpha _k u_k\)

  • \(t_{k+1} = A s_{k+1}\)

  • \(\rho _{k+1} = s_{k+1}\T t_{k+1}\)

  • \(\beta _k = \rho _{k+1} / \rho _{k}\)

  • \(p_{k+1} = r_{k+1} + \beta _{k} p_k\)

  • \(q_{k+1} = s_{k+1} + \beta _{k} q_k\)

  • \(u_{k+1} = t_{k+1} + \beta _{k} u_k\)

  • \(k \leftarrow k + 1\)

  • end while
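Algorithm 3 can be sketched the same way. The NumPy version below (illustrative only; the SPD test system is an arbitrary choice) also records \(\norm {Ar_k}\) at each iteration to check the monotonic decrease claimed above; note that \(s_k = A r_k\) is maintained by the recurrence \(s_{k+1} = s_k - \alpha _k u_k\), so the only product per iteration is \(t_{k+1} = A s_{k+1}\):

```python
import numpy as np

def car(A, b, eps=1e-10, maxiter=100):
    """CAr (Algorithm 3) for SPD A: minimizes ||A r_k|| over K_k(A, b)."""
    x = np.zeros_like(b)
    r = b.copy()          # r_0 = b
    p = r.copy()          # p_0 = r_0
    s = A @ r             # s_0 = A r_0
    q = s.copy()          # q_0 = s_0
    t = A @ s             # t_0 = A s_0
    u = t.copy()          # u_0 = t_0
    rho = s @ t           # rho_0 = s_0^T t_0
    Ar_norms = [np.linalg.norm(s)]     # track ||A r_k||
    for _ in range(maxiter):
        if np.linalg.norm(r) <= eps:
            break
        alpha = rho / (u @ u)          # alpha_k = rho_k / ||u_k||^2
        x = x + alpha * p
        r = r - alpha * q
        s = s - alpha * u              # keeps s_k = A r_k without a product
        t = A @ s                      # the single matrix-vector product
        rho_next = s @ t
        beta = rho_next / rho
        p = r + beta * p
        q = s + beta * q               # q_k = A p_k
        u = t + beta * u               # u_k = A q_k
        rho = rho_next
        Ar_norms.append(np.linalg.norm(s))
    return x, Ar_norms

# Arbitrary 2x2 SPD test system.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x, Ar_norms = car(A, b)
assert np.allclose(A @ x, b)
# ||A r_k|| decreases monotonically (up to rounding), as the theory predicts for SPD A.
assert all(Ar_norms[i + 1] <= Ar_norms[i] + 1e-12 for i in range(len(Ar_norms) - 1))
```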

3 Krylov.jl

The algorithms MinAres and CAr have been implemented in Julia [1] as part of the package Krylov.jl [8], which provides a suite of Krylov and block-Krylov methods. Leveraging Julia’s flexibility and multiple dispatch capabilities, our implementations are compatible with all floating-point systems supported by the language, including complex numbers. These methods are optimized for both CPU and GPU architectures, ensuring high performance across a wide range of computational platforms. Additionally, our implementations support preconditioners, enhancing convergence and robustness across various problem classes.

References