
Mixed Precision Iterative Refinement for Linear Inverse Problems

Lucas Onisk and James G. Nagy

Abstract

We are interested in linear discrete inverse problems which involve the reconstruction of objects or signals from noisy observed data. The available linear system is given by

\begin{equation} \label {linsys} Ax + e = b, \end{equation}

where \(A \in \mathbb {R}^{m \times n}\), \(m \geq n\), is a matrix whose singular values decay without significant gap and cluster at the origin (i.e., the matrix is ill-conditioned). Discrete inverse problems can arise through the discretization of Fredholm integral equations of the first kind (see [5, 3]), but they can also arise in massive data streaming problems, such as the training of the random feature model in machine learning [8], or in limited angle imaging problems, including, for example, those from medical imaging [2]. To derive a meaningful solution from the available problem, regularization is needed.
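The following small sketch (an illustration, not from the abstract) uses the Hilbert matrix, a classic discretized first-kind integral operator, to show the kind of singular value decay described above:

```python
import numpy as np

# A small ill-conditioned test matrix: the n-by-n Hilbert matrix,
# with entries 1/(i + j - 1), a standard discrete inverse problem example.
n = 12
i, j = np.ogrid[1:n + 1, 1:n + 1]
A = 1.0 / (i + j - 1.0)

# Singular values decay steadily toward zero with no significant gap.
s = np.linalg.svd(A, compute_uv=False)
print(f"largest sigma  = {s[0]:.2e}")
print(f"smallest sigma = {s[-1]:.2e}")
print(f"condition number ~ {s[0] / s[-1]:.2e}")  # enormous: ill-conditioned
```

Because the singular values cluster at the origin, naive least-squares solutions amplify the noise \(e\), which is why regularization is required.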

In Tikhonov regularization, the least-squares problem associated with (1) is replaced by the penalized least-squares problem

\begin{equation} \label {TikEqn} \min _{x\in \mathbb {R}^{n}} \left \{\left \|Ax-b\right \|_2^2 + \alpha ^2\left \|Lx\right \|_2^2\right \} \end{equation}

where \(\alpha >0\) is a regularization parameter that balances the sensitivity of the solution vector to the error in \(b\) against its closeness to the desired solution of the unavailable error-free problem. When the regularization matrix \(L \in \mathbb {R}^{s \times n}\) is chosen so that the null spaces of \(A\) and \(L\) intersect trivially, the solution of (2) may be written in closed form.
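For small dense problems, the closed-form solution \(x = (A^TA + \alpha ^2L^TL)^{-1}A^Tb\) can be sketched as follows (an illustrative helper, not the authors' implementation; `tikhonov` and the example data are ours):

```python
import numpy as np

def tikhonov(A, b, alpha, L=None):
    """Closed-form Tikhonov solution x = (A^T A + alpha^2 L^T L)^{-1} A^T b.

    L defaults to the identity (standard-form Tikhonov). For large problems
    one would instead solve the stacked least-squares problem
    [A; alpha*L] x ~ [b; 0] rather than form the normal equations.
    """
    n = A.shape[1]
    if L is None:
        L = np.eye(n)
    return np.linalg.solve(A.T @ A + alpha**2 * (L.T @ L), A.T @ b)

# Tiny illustrative usage with synthetic data.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
x_true = np.ones(5)
b = A @ x_true + 1e-3 * rng.standard_normal(20)
x = tikhonov(A, b, alpha=1e-2)
```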

Iterative refinement (IR) has long been used as an iterative strategy to improve the accuracy of numerical solutions to linear systems of equations. Recent works by Higham and collaborators have considered IR in conjunction with mixed precision computing in light of recent advancements in hardware capabilities; see [6, 1]. Our interest in studying IR applied to the Tikhonov problem was motivated by [7], which considered the solution of symmetric positive definite linear systems and least-squares problems in mixed precision and showed regularization to be a key requirement when computing low precision factorizations.

The \(k^{th}\) iterate of IR applied to the Tikhonov problem in standard form (\(L=I\)), \((A^TA + \alpha ^2I)x=A^Tb\), may be written recursively as

\begin{equation} \label {IR_recursive_Tik} x^{(k)} = x^{(k-1)} + \left (A^TA + \alpha ^2I\right )^{-1}A^Tr^{(k-1)} - \alpha ^2 \left (A^TA + \alpha ^2I\right )^{-1}x^{(k-1)} \end{equation}

where \(r^{(k-1)}=b-Ax^{(k-1)}\) denotes the \((k-1)^{th}\) residual. Riley in [9] and Golub in [4] note that the IR procedure in (3) is equivalent to iterated Tikhonov regularization in exact arithmetic whose \(k^{th}\) iterate is given by

\begin{equation*} \label {iterTikReg} x^{(k)} = x^{(k-1)} + \left (A^TA+\alpha ^2I\right )^{-1}A^Tr^{(k-1)} \end{equation*}

which may be interpreted as a preconditioned Landweber method or, from a mathematical optimization point of view, as a preconditioned gradient descent method with fixed step size.
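The iterated Tikhonov recursion above can be sketched in a few lines; this is an illustrative dense implementation (the function name and Cholesky-based solve are our choices, not the authors'), where the factorization of the regularized normal-equations matrix is the step that a mixed precision IR scheme would compute in low precision:

```python
import numpy as np

def iterated_tikhonov(A, b, alpha, k_max=50):
    """Iterated Tikhonov: x_k = x_{k-1} + (A^T A + alpha^2 I)^{-1} A^T r_{k-1}.

    Dense sketch: factor M = A^T A + alpha^2 I once with Cholesky,
    then reuse the factor for every correction solve.
    """
    n = A.shape[1]
    M = A.T @ A + alpha**2 * np.eye(n)
    c = np.linalg.cholesky(M)                 # M = c @ c.T, c lower triangular
    x = np.zeros(n)
    for _ in range(k_max):
        r = b - A @ x                         # residual of the original system
        dx = np.linalg.solve(c.T, np.linalg.solve(c, A.T @ r))
        x = x + dx
    return x
```

Note that a single iteration from \(x^{(0)}=0\) reproduces the standard Tikhonov solution \((A^TA+\alpha ^2I)^{-1}A^Tb\); subsequent iterations progressively reduce the regularization bias.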

To better understand mixed precision IR applied to the Tikhonov problem, we derive a methodology to formulate the iterates as filtered solutions by writing them as a recursive relationship between the iterates of preconditioned Landweber with a Tikhonov-type preconditioner and previous iterates. A filtered solution is of the form

\begin{equation*} \label {filteredSoln} x_{\mathrm {filt}} = \sum _j \phi _j \frac {u_j^Tb}{\sigma _j}v_j \end{equation*}

where \(u_j\) and \(v_j\) are the left and right singular vectors of \(A\), respectively, and the \(\sigma _j\) are its singular values. An intelligent selection of the filter factors \(\phi _j\) can remove deleterious components of the approximate solution to the least-squares problem stemming from (1). By considering a filtered solution, we are able to study the effect that each level of precision utilized in IR has on (i) the quality of the approximate solution and (ii) the number of iterations the algorithm requires to terminate according to a given termination criterion. Our numerical results demonstrate that mixed precision IR on the Tikhonov problem attains accuracy comparable or superior to results computed in double precision, as well as to another benchmark, which supports its use in modern applications that natively support mixed precision floating-point arithmetic.
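For iterated Tikhonov in exact arithmetic, the filter factors after \(k\) steps have the known closed form \(\phi _j = 1 - \left (\alpha ^2/(\sigma _j^2+\alpha ^2)\right )^k\). The sketch below (our illustration, with synthetic data) verifies numerically that the filtered-SVD solution built from these factors matches the iterate produced by the recursion:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((30, 8))
b = rng.standard_normal(30)
alpha, k = 0.5, 5

# Filtered solution: x = sum_j phi_j * (u_j^T b / sigma_j) * v_j,
# with the iterated Tikhonov filter factors after k steps.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
phi = 1.0 - (alpha**2 / (s**2 + alpha**2))**k
x_filt = Vt.T @ (phi * (U.T @ b) / s)

# Same iterate via the recursion x_k = x_{k-1} + (A^T A + a^2 I)^{-1} A^T r_{k-1}.
M = A.T @ A + alpha**2 * np.eye(A.shape[1])
x = np.zeros(A.shape[1])
for _ in range(k):
    x = x + np.linalg.solve(M, A.T @ (b - A @ x))

print(np.allclose(x_filt, x))  # the two solutions agree
```

Each filter factor lies in \((0,1)\) and approaches 1 as \(k\) grows for components with \(\sigma _j \gg \alpha \), while components with \(\sigma _j \ll \alpha \) remain strongly damped, which is precisely what makes the filtered-solution view useful for analyzing precision effects.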

References