GMRES with Preconditioning, Weighted norm and Deflation
Nicole Spillane, Daniel B. Szyld
Abstract
We consider the general problem of solving a linear system of the form
\[ \bA \bx = \rhs ; \, \bA \in \mathbb C^{n \times n}; \, \rhs \in \mathbb C^n. \]
The matrices \(\bA \) that we consider are non-singular, sparse and of high order \(n\). For solving these matrices, GMRES [3, Chapter 6] is a natural choice. We address two fundamental and connected questions: How can the convergence of GMRES be predicted ? How can the convergence of GMRES be accelerated ? Our aim is to combine three ways of accelerating GMRES convergence:
-
• Weighting by a Hermitian positive definite (hpd) matrix \(\bW \): all inner products and norms in the GMRES algorithm are replaced by the ones induced by \(\bW \) (see [1]),
-
• Preconditioning by a non-singular matrix \(\bH \): GMRES is applied to the preconditioned problem \(\bA \bH \bu = \rhs \) with \(\bx = \bH \bu \) (see [3, Section 9.3]),
-
• Deflation by a projection operator \(\bPi := \matid - \bA \bZ (\bY ^* \bA \bZ )^{-1} \bY ^* \) (with \(\bY , \bZ \in \mathbb C^{n \times m}\)): GMRES is applied to the projected problem \(\bPi \bA \bH \bu = \bPi \rhs \) (see [7, 4]). A suitable initialization is also performed that accounts for the part of the solution that has been projected away.
We refer to \(\bW \), \(\bH \) and \(\bPi \) as accelerators for GMRES. With words, the strategy is that the preconditioner \(\bH \) should be a good approximation of \(\bA ^{-1}\), the deflation operator should handle the space where \(\bH \) does not well approximate \(\bA ^{-1}\), and the weighted inner product should facilitate the analysis. In practice, identifying efficient accelerators requires a GMRES convergence bound where the influence of \(\bH \), \(\bW \) and \(\bPi \) is explicit. We prove in [6, Theorem 4.1] that the convergence rate is bounded by
\[ \frac { \|\br _{i+1} \|_\bW ^2}{\|\br _{i} \|_\bW ^2} \leq 1 - \operatorname {inf}\limits _{\by \in \operatorname {range} (\bPi )\setminus \{\mathbf 0\}}\frac {|\langle {\bPi \bA \bH \by }, \by \rangle _\bW |^2}{ \| \bPi \bA \bH \by \|_\bW ^2 \|\by \|_\bW ^2} \cdot \]
Further Assumptions Major simplifications occur in the case where \(\bA \) is positive definite (i.e., its Hermitian part is hpd), the preconditioner \(\bH \) is hpd, and the weight equals the preconditioner \(\bW = \bH \). In this case (and with a technical assumption on the deflation operator), it holds that
\[ \frac { \|\br _{i+1} \|^2_\bH }{\|\br _{i} \|^2_\bH } \leq 1- \operatorname {inf}\limits _{\by \in \operatorname {range} (\bA \bH \bPi )\setminus \{\mathbf 0\}} \frac {|\langle \bA ^{-1} \by , \by \rangle |}{ \langle \by , {\bM ^{-1}} \by \rangle } \times \frac {{\lambda _{\text {min}}(\bH \bM )}}{{\lambda _{\text {max}}(\bH \bM )}}, \]
where \(\bM = 1/2 (\bA + \bA ^*)\) and \(\bN = 1/2 (\bA - \bA ^*)\) are the Hermitian and skew-Hermitian parts of \(\bA \), and the spectrum of \(\bH \bM \) is in the interval \([ \lambda _{\text {min}}(\bH \bM ), \lambda _{\text {max}}(\bH \bM )]\).
Convergence without deflation Setting \(\bPi = \matid \) (no deflation) and with an identity from [2] it is proved in [5, Theorem 4.3] that
\[ \frac { \|\br _{i+1} \|^2_\bH }{\|\br _{i} \|^2_\bH } \leq 1- \frac {1}{1+{\rho ( {\bM ^{-1}} \bN )}^2 } \times \frac {{\lambda _{\text {min}}(\bH \bM )}}{{\lambda _{\text {max}}(\bH \bM )}} \]
where \(\rho (\cdot )\) denotes the spectral radius of a matrix. The residuals are bounded with respect to two quantities. The first is the condition number of \(\bH \bM \), a measure of whether \(\bH \) is a good preconditioner for the hpd matrix \(\bM \). The second is the spectral radius of \(\bM ^{-1} \bN \), a measure of how non-Hermitian the problem is. The takeaway is that fast convergence is guaranteed if the problem is mildly non-Hermitian and \(\bH \) is a good preconditioner for \(\bM \). The bound also has important consequences for parallel computing and the analysis of domain decomposition methods.
A new deflation space [6, Theorem 6.3]
When the problem is significantly non-Hermitian (in terms of \(\rho ( {\bM ^{-1}} \bN )\)), we propose to combine Hermitian preconditioning with spectral deflation.
Under the same assumptions as above, we choose the matrices \(\bZ \) and \(\bY \) in the characterization of the projection operator \(\bPi \) as follows. First, we denote by \((\lambda _j, \bz ^{(j)}) \in \mathrm {i} \mathbb R \times \mathbb C^n\) (for \(j = 1, \dots , n\)) the eigenpairs of the generalized eigenvalue problem \(\displaystyle {\bN } \bz ^{(j)} = \lambda _{j}\, {\bM }\bz ^{(j)}. \) Then, with a chosen threshold \(\tau > 0\) we select for the deflation operator, the highest frequency eigenvectors, by setting
\[ \operatorname {span}( \bZ ) := \operatorname {span} \{ \bz ^{(j)} ; |\lambda _{j} | > {\tau } \} \text { and } \bY = \bH \bA \bZ . \]
Then the convergence of weighted, preconditioned and deflated GMRES is bounded by
\[ \frac {\| \br _{i+1} \|_\bH ^2}{\| \br _{i} \|_\bH ^2} \leq 1- \frac {1}{ (1+{\tau }^2)} \times \frac {{\lambda _{\text {min}}(\bH \bM )}}{{\lambda _{\text {max}}(\bH \bM )}}. \]
Numerical illustrations show that preconditioning the Hermitian part in a way that is scalable leads to overall scalability and that spectral deflation accelerates convergence when the problems become more strongly non-Hermitian.
References
-
[1] A. Essai. Weighted FOM and GMRES for solving nonsymmetric linear systems. Numer. Algorithms, 18(3-4):277–292, 1998.
-
[2] C. R. Johnson. Inequalities for a complex matrix whose real part is positive definite. Trans. Am. Math. Soc., 212:149–154, 1975.
-
[3] Y. Saad. Iterative methods for sparse linear systems. Philadelphia, PA: SIAM Society for Industrial and Applied Mathematics, 2nd ed. edition, 2003.
-
[4] K. M. Soodhalter, E. de Sturler, and M. E. Kilmer. A survey of subspace recycling iterative methods. GAMM-Mitt., 43(4):29, 2020. Id/No e202000016.
-
[5] N. Spillane. Hermitian preconditioning for a class of non-Hermitian linear systems. SIAM J. Sci. Comput., 46(3):a1903–a1922, 2024.
-
[6] N. Spillane and D. B. Szyld. New convergence analysis of GMRES with weighted norms, preconditioning, and deflation, leading to a new deflation space. SIAM J. Matrix Anal. Appl., 45(4):1721–1745, 2024.
-
[7] J. M. Tang, R. Nabben, C. Vuik, and Y. A. Erlangga. Comparison of two-level preconditioners derived from deflation, domain decomposition and multigrid methods. J. Sci. Comput., 39:340–370, 2009.