
GMRES with Preconditioning, Weighted Norm and Deflation

Nicole Spillane, Daniel B. Szyld

Abstract

We consider the general problem of solving a linear system of the form

\[ \mathbf{A} \mathbf{x} = \mathbf{b}, \qquad \mathbf{A} \in \mathbb{C}^{n \times n}, \quad \mathbf{b} \in \mathbb{C}^{n}. \]

The matrices \(\mathbf{A}\) that we consider are non-singular, sparse and of large order \(n\). For solving such systems, GMRES [3, Chapter 6] is a natural choice. We address two fundamental and connected questions: How can the convergence of GMRES be predicted? How can the convergence of GMRES be accelerated? Our aim is to combine three ways of accelerating GMRES convergence: (i) preconditioning by a matrix \(\mathbf{H}\), (ii) deflation by a projection operator \(\boldsymbol{\Pi}\), and (iii) replacing the Euclidean inner product by the inner product induced by a Hermitian positive definite (hpd) matrix \(\mathbf{W}\).

We refer to \(\mathbf{W}\), \(\mathbf{H}\) and \(\boldsymbol{\Pi}\) as accelerators for GMRES. In words, the strategy is that the preconditioner \(\mathbf{H}\) should be a good approximation of \(\mathbf{A}^{-1}\), the deflation operator should handle the space where \(\mathbf{H}\) does not approximate \(\mathbf{A}^{-1}\) well, and the weighted inner product should facilitate the analysis. In practice, identifying efficient accelerators requires a GMRES convergence bound in which the influence of \(\mathbf{H}\), \(\mathbf{W}\) and \(\boldsymbol{\Pi}\) is explicit. We prove in [6, Theorem 4.1] that the convergence rate is bounded by

\[ \frac{\|\mathbf{r}_{i+1}\|_{\mathbf{W}}^{2}}{\|\mathbf{r}_{i}\|_{\mathbf{W}}^{2}} \leq 1 - \inf_{\mathbf{y} \in \operatorname{range}(\boldsymbol{\Pi}) \setminus \{\mathbf{0}\}} \frac{|\langle \boldsymbol{\Pi} \mathbf{A} \mathbf{H} \mathbf{y}, \mathbf{y} \rangle_{\mathbf{W}}|^{2}}{\|\boldsymbol{\Pi} \mathbf{A} \mathbf{H} \mathbf{y}\|_{\mathbf{W}}^{2} \, \|\mathbf{y}\|_{\mathbf{W}}^{2}}. \]
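
To see what this quantity measures, the infimum can be estimated numerically on a small test problem. The following is a minimal NumPy sketch (the function name, the dense-matrix setting and the Monte Carlo sampling are assumptions for illustration, not the method of [6]): sampling vectors of the form \(\mathbf{y} = \boldsymbol{\Pi}\mathbf{v}\) guarantees \(\mathbf{y} \in \operatorname{range}(\boldsymbol{\Pi})\), and the sampled minimum can only over-estimate the infimum, so the resulting contraction estimate is optimistic.

import numpy as np

def contraction_estimate(A, H, W, Pi, n_samples=2000, seed=0):
    # Monte Carlo estimate of
    #   1 - inf_{y in range(Pi)} |<Pi A H y, y>_W|^2 / (||Pi A H y||_W^2 ||y||_W^2),
    # using the W-inner product <u, v>_W = v^* W u.  Dense matrices only.
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    inf_estimate = np.inf
    for _ in range(n_samples):
        v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
        y = Pi @ v                                   # y lies in range(Pi)
        z = Pi @ (A @ (H @ y))                       # z = Pi A H y
        ny2 = np.vdot(y, W @ y).real                 # ||y||_W^2
        nz2 = np.vdot(z, W @ z).real                 # ||Pi A H y||_W^2
        if ny2 < 1e-14 or nz2 < 1e-14:
            continue
        num = abs(np.vdot(y, W @ z)) ** 2            # |<Pi A H y, y>_W|^2
        inf_estimate = min(inf_estimate, num / (nz2 * ny2))
    return 1.0 - inf_estimate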

Further assumptions

Major simplifications occur in the case where \(\mathbf{A}\) is positive definite (i.e., its Hermitian part is hpd), the preconditioner \(\mathbf{H}\) is hpd, and the weight equals the preconditioner: \(\mathbf{W} = \mathbf{H}\). In this case (and with a technical assumption on the deflation operator), it holds that

\[ \frac{\|\mathbf{r}_{i+1}\|_{\mathbf{H}}^{2}}{\|\mathbf{r}_{i}\|_{\mathbf{H}}^{2}} \leq 1 - \inf_{\mathbf{y} \in \operatorname{range}(\mathbf{A}\mathbf{H}\boldsymbol{\Pi}) \setminus \{\mathbf{0}\}} \frac{|\langle \mathbf{A}^{-1}\mathbf{y}, \mathbf{y} \rangle|}{\langle \mathbf{y}, \mathbf{M}^{-1}\mathbf{y} \rangle} \times \frac{\lambda_{\min}(\mathbf{H}\mathbf{M})}{\lambda_{\max}(\mathbf{H}\mathbf{M})}, \]

where \(\mathbf{M} = \frac{1}{2}(\mathbf{A} + \mathbf{A}^*)\) and \(\mathbf{N} = \frac{1}{2}(\mathbf{A} - \mathbf{A}^*)\) are the Hermitian and skew-Hermitian parts of \(\mathbf{A}\), and the spectrum of \(\mathbf{H}\mathbf{M}\) is contained in the interval \([\lambda_{\min}(\mathbf{H}\mathbf{M}), \lambda_{\max}(\mathbf{H}\mathbf{M})]\).

Convergence without deflation

Setting \(\boldsymbol{\Pi} = \mathbf{I}\) (no deflation) and using an identity from [2], it is proved in [5, Theorem 4.3] that

\[ \frac{\|\mathbf{r}_{i+1}\|_{\mathbf{H}}^{2}}{\|\mathbf{r}_{i}\|_{\mathbf{H}}^{2}} \leq 1 - \frac{1}{1 + \rho(\mathbf{M}^{-1}\mathbf{N})^{2}} \times \frac{\lambda_{\min}(\mathbf{H}\mathbf{M})}{\lambda_{\max}(\mathbf{H}\mathbf{M})}, \]

where \(\rho(\cdot)\) denotes the spectral radius of a matrix. The bound involves two quantities. The first is the condition number of \(\mathbf{H}\mathbf{M}\), which measures whether \(\mathbf{H}\) is a good preconditioner for the hpd matrix \(\mathbf{M}\). The second is the spectral radius of \(\mathbf{M}^{-1}\mathbf{N}\), which measures how non-Hermitian the problem is. The takeaway is that fast convergence is guaranteed if the problem is only mildly non-Hermitian and \(\mathbf{H}\) is a good preconditioner for \(\mathbf{M}\). The bound also has important consequences for parallel computing and for the analysis of domain decomposition methods.
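
For a small dense test matrix, the two quantities and the resulting bound can be computed directly. The following is a minimal NumPy sketch (the function name and the use of dense eigensolvers are assumptions for illustration; for large sparse problems these quantities would only be estimated):

import numpy as np

def bound_without_deflation(A, H):
    # Evaluate 1 - 1/(1 + rho(M^{-1} N)^2) * lambda_min(HM) / lambda_max(HM)
    # for a positive definite A (hpd Hermitian part M) and an hpd preconditioner H.
    M = (A + A.conj().T) / 2            # Hermitian part of A
    N = (A - A.conj().T) / 2            # skew-Hermitian part of A
    # HM is similar to M^{1/2} H M^{1/2}, so its spectrum is real and positive.
    lam_HM = np.linalg.eigvals(H @ M).real
    # M^{-1} N has purely imaginary eigenvalues; rho is the largest modulus.
    rho = np.abs(np.linalg.eigvals(np.linalg.solve(M, N))).max()
    return 1.0 - (lam_HM.min() / lam_HM.max()) / (1.0 + rho**2)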

A new deflation space [6, Theorem 6.3]

When the problem is significantly non-Hermitian (in terms of \(\rho(\mathbf{M}^{-1}\mathbf{N})\)), we propose to combine Hermitian preconditioning with spectral deflation.

Under the same assumptions as above, we choose the matrices \(\mathbf{Z}\) and \(\mathbf{Y}\) in the characterization of the projection operator \(\boldsymbol{\Pi}\) as follows. First, we denote by \((\lambda_j, \mathbf{z}^{(j)}) \in \mathrm{i}\mathbb{R} \times \mathbb{C}^n\) (for \(j = 1, \dots, n\)) the eigenpairs of the generalized eigenvalue problem \(\mathbf{N}\mathbf{z}^{(j)} = \lambda_j\, \mathbf{M}\mathbf{z}^{(j)}\). Then, for a chosen threshold \(\tau > 0\), we select for the deflation operator the highest-frequency eigenvectors, by setting

\[ \operatorname{span}(\mathbf{Z}) := \operatorname{span}\{\mathbf{z}^{(j)} ;\, |\lambda_j| > \tau\} \quad \text{and} \quad \mathbf{Y} = \mathbf{H}\mathbf{A}\mathbf{Z}. \]
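
A minimal dense sketch of this construction follows (the function name is illustrative, and the exact characterization of \(\boldsymbol{\Pi}\) in terms of \(\mathbf{Z}\) and \(\mathbf{Y}\) is the one of [6], not reproduced here; in practice only the few eigenpairs above the threshold would be computed, with an iterative eigensolver):

import numpy as np
from scipy.linalg import eig

def deflation_space(A, H, tau):
    # Build Z from the high-frequency eigenvectors of N z = lambda M z
    # (those with |lambda| > tau) and set Y = H A Z.
    M = (A + A.conj().T) / 2
    N = (A - A.conj().T) / 2
    lam, V = eig(N, M)                  # eigenvalues are (numerically) purely imaginary
    keep = np.abs(lam) > tau            # threshold on the eigenvalue modulus
    Z = V[:, keep]
    Y = H @ (A @ Z)
    return Z, Y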

Then the convergence of weighted, preconditioned and deflated GMRES is bounded by

\[ \frac{\|\mathbf{r}_{i+1}\|_{\mathbf{H}}^{2}}{\|\mathbf{r}_{i}\|_{\mathbf{H}}^{2}} \leq 1 - \frac{1}{1 + \tau^{2}} \times \frac{\lambda_{\min}(\mathbf{H}\mathbf{M})}{\lambda_{\max}(\mathbf{H}\mathbf{M})}. \]
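
For instance, with the illustrative values \(\tau = 1\) and \(\lambda_{\min}(\mathbf{H}\mathbf{M})/\lambda_{\max}(\mathbf{H}\mathbf{M}) = 1/2\), the bound gives

\[ \frac{\|\mathbf{r}_{i+1}\|_{\mathbf{H}}^{2}}{\|\mathbf{r}_{i}\|_{\mathbf{H}}^{2}} \leq 1 - \frac{1}{2} \cdot \frac{1}{2} = \frac{3}{4}, \]

i.e., the \(\mathbf{H}\)-norm of the residual decreases by at least a factor \(\sqrt{3/4} \approx 0.87\) per iteration, independently of \(\rho(\mathbf{M}^{-1}\mathbf{N})\), since the eigenvalues above the threshold have been deflated.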

Numerical illustrations show that preconditioning the Hermitian part in a scalable way leads to overall scalability, and that spectral deflation accelerates convergence when the problems become more strongly non-Hermitian.

References