
Algorithm NCL for constrained optimization:
Solving the linear systems within interior methods

Michael Saunders, Ding Ma, Alexis Montoison, Dominique Orban

Abstract

1 Constrained optimization

We consider large smooth constrained optimization problems of the form

\begin{equation} \label{NC} \tag{NC} \min_{x \in \Re^n} \ \phi(x) \quad \text{subject to} \quad c(x) = 0, \ \ \ell \le x \le u, \end{equation}

where \(\phi (x)\) is a smooth scalar function and \(c(x) \in \Re ^m\) is a vector of smooth linear or nonlinear functions. We assume that first and second derivatives are available. If the constraints include any linear or nonlinear inequalities, we assume that slack variables have already been included as part of \(x\), and appropriate bounds are included in \(\ell \) and \(u\). Problem NC is general in this sense.
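For example, an inequality \(g_i(x) \le b_i\) would already have been rewritten as \(g_i(x) + s_i = b_i\), with the slack variable \(s_i\) appended to \(x\) and the bound \(s_i \ge 0\) included in \(\ell \le x \le u\).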

2 LANCELOT

LANCELOT [1, 2, 6] is designed to solve large, smooth constrained optimization problems. For problem NC, LANCELOT solves a sequence of about 10 BCL (bound-constrained augmented Lagrangian) subproblems of the form

\begin{equation} \label{BCk} \tag{BC$_k$} \min_{x \in \Re^n} \ \phi(x) - y_k^T c(x) + \tfrac{1}{2} \rho_k\, c(x)^T c(x) \quad \text{subject to} \quad \ell \le x \le u, \end{equation}

where \(y_k\) is an estimate of the dual variables for the nonlinear constraints \(c(x)=0\), and \(\rho_k > 0\) is a penalty parameter. After BC\(_k\) is solved (perhaps approximately) to give a subproblem solution \(x_k^*\), the size of \(\|c(x_k^*)\|\) determines whether \(y_k\) is updated or \(\rho_k\) is increased to define BC\(_{k+1}\).
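With the sign convention of BC\(_k\), the first-order multiplier estimate at \(x_k^*\) is \(y_k - \rho_k\, c(x_k^*)\), so a sketch of the usual BCL rule (the exact tolerances of LANCELOT are not reproduced here) is: set \(y_{k+1} = y_k - \rho_k\, c(x_k^*)\) when \(\|c(x_k^*)\|\) is sufficiently small; otherwise keep \(y_{k+1} = y_k\) and choose \(\rho_{k+1} > \rho_k\).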

3 Algorithm NCL

Algorithm NCL [7] mimics LANCELOT with only one change: subproblem BCk is replaced by the equivalent larger subproblem

\begin{equation} \label{NCk} \tag{NC$_k$} \min_{x \in \Re^n,\, r \in \Re^m} \ \phi(x) + y_k^T r + \tfrac{1}{2} \rho_k\, r^T r \quad \text{subject to} \quad c(x) + r = 0, \quad \ell \le x \le u. \end{equation}

Substituting \(r = -c(x)\) recovers BC\(_k\), confirming the equivalence. Given a subproblem solution \((x_k^*, r_k^*)\), the choice between updating \(y_k\) and increasing \(\rho_k\) is based on \(\|r_k^*\|\). We expect \(\|r_k^*\| \rightarrow 0\), so that \(x_k^*\) comes increasingly close to solving NC.
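A minimal sketch of the NCL outer iteration under these rules (the helper solve_NCk is hypothetical, standing for any NLP solver applied to NC\(_k\); the tolerances and the factor 10 are illustrative, not the tuned values of [7]):

```python
import numpy as np

def ncl(solve_NCk, y0, rho0=100.0, eta=1e-2, tol=1e-8,
        rho_max=1e12, max_outer=30):
    """Schematic NCL outer loop (illustrative constants).

    solve_NCk(y, rho) -> (x, r, y_sub): solves NC_k for fixed (y_k, rho_k),
    returning the primal solution (x, r) and the subproblem multipliers
    y_sub for the constraint c(x) + r = 0 (stationarity with respect to r
    gives y_sub = y_k + rho_k * r at a subproblem solution).
    """
    y, rho, r_norm_prev = y0, rho0, np.inf
    for k in range(max_outer):
        x, r, y_sub = solve_NCk(y, rho)
        r_norm = np.linalg.norm(r, np.inf)
        if r_norm <= tol:                   # ||r|| ~ 0: x almost solves NC
            return x, y_sub
        if r_norm <= eta * r_norm_prev:     # ||r|| decreased enough:
            y = y_sub                       #   update the multipliers
        else:                               # otherwise:
            rho = min(10.0 * rho, rho_max)  #   increase the penalty
        r_norm_prev = r_norm
    return x, y_sub
```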

The active-set solvers CONOPT [3], MINOS [8], and SNOPT [13] are nominally applicable to NC\(_k\). Their reduced-gradient algorithms would naturally choose \(r\) as basic variables, while the \(x\) variables would be either superbasic (free to move) or nonbasic (fixed at one of their bounds). However, this is inefficient on large problems unless most bounds are active at the subproblem solution \(x_k^*\).

In contrast, interior methods welcome the extra variables \(r\) in NC\(_k\), as explained in [7]: the Jacobian of the constraints \(c(x) + r = 0\) is \((\,J \;\; I\,)\), which has full row rank everywhere, so the subproblems remain well posed even if the constraints of NC are degenerate.

4 The linear system in nonlinear interior methods

For simplicity, we assume that the bounds \(\ell \le x \le u\) are simply \(x \ge 0\). Let \(y\) and \(z\) be dual variables associated with the constraints \(c(x) = 0\) and \(x \ge 0\) respectively, and let \(X = \mathrm{diag}(x)\), \(Z = \mathrm{diag}(z)\). When a nonlinear primal-dual interior method such as IPOPT [4] or KNITRO [5] is applied to NC\(_k\), each search direction is obtained from a linear system of the form

\begin{equation} \label{K3} \tag{K3} \begin{pmatrix} -(H + X^{-1} Z) & & J^T \\ & -\rho_k I & I \\ J & I & \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta r \\ \Delta y \end{pmatrix} = \begin{pmatrix} r_2 \\ r_3 \\ r_1 \end{pmatrix}, \end{equation}

where \(H\) is the Hessian of the Lagrangian, \(J\) is the Jacobian of \(c(x)\), and \(r_1\), \(r_2\), \(r_3\) are residual vectors. Although this system is larger than the one for NC itself, the additional variables \(\Delta r\) do not damage the sparsity of the matrix. IPOPT and KNITRO have performed well on problem NC\(_k\) as it stands, solving systems (K3).

5 Reducing the size of (K3)

For all NC\(_k\), \(\rho_k \ge 1\) (and ultimately \(\rho_k \gg 1\)), and it is stable to eliminate \(\Delta r\) from (K3) to obtain

\begin{equation} \label{K2} \tag{K2} \begin{pmatrix} -(H + X^{-1} Z) & J^T \\ J & \frac{1}{\rho_k} I \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} = \begin{pmatrix} r_2 \\ r_1 + \frac{r_3}{\rho_k} \end{pmatrix}, \qquad \Delta r = \frac{1}{\rho_k} (\Delta y - r_3). \end{equation}

If the original problem is convex, \(H + X^{-1} Z\) is symmetric positive definite (SPD) and it is possible to eliminate \(\Delta y\):

\begin{equation} \label{K1} \tag{K1} (H + X^{-1} Z + \rho_k J^T J)\, \Delta x = -r_2 + J^T (r_3 + \rho_k r_1), \qquad \Delta y = r_3 + \rho_k (r_1 - J \Delta x). \end{equation}

These reductions would require recoding of IPOPT and KNITRO (which is not likely to happen), but they are practical within the nonlinear interior solver MadNLP [11].
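As a sanity check on the two eliminations, the following sketch builds a small dense instance with \(H + X^{-1}Z\) SPD and confirms that (K3), (K2), and (K1) produce the same search direction (a NumPy stand-in; dimensions and data are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, rho = 6, 3, 1e4

A = rng.standard_normal((n, n))
W = A @ A.T + n * np.eye(n)       # W = H + X^{-1}Z, made SPD (convex case)
J = rng.standard_normal((m, n))   # Jacobian of c(x)
r1, r2, r3 = (rng.standard_normal(k) for k in (m, n, m))

# (K3): full 3x3 block system in (dx, dr, dy)
K3 = np.block([[-W,                np.zeros((n, m)), J.T             ],
               [np.zeros((m, n)), -rho * np.eye(m),  np.eye(m)       ],
               [J,                 np.eye(m),        np.zeros((m, m))]])
dx3, dr3, dy3 = np.split(np.linalg.solve(K3, np.concatenate([r2, r3, r1])),
                         [n, n + m])

# (K2): eliminate dr = (dy - r3) / rho
K2 = np.block([[-W, J.T], [J, np.eye(m) / rho]])
dx2, dy2 = np.split(np.linalg.solve(K2, np.concatenate([r2, r1 + r3 / rho])), [n])
dr2 = (dy2 - r3) / rho

# (K1): also eliminate dy, leaving an SPD system in dx
dx1 = np.linalg.solve(W + rho * J.T @ J, -r2 + J.T @ (r3 + rho * r1))
dy1 = r3 + rho * (r1 - J @ dx1)

assert np.allclose(dx3, dx2) and np.allclose(dy3, dy2) and np.allclose(dr3, dr2)
assert np.allclose(dx3, dx1) and np.allclose(dy3, dy1)
```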

6 MadNLP, MadNCL, and GPUs

Algorithm NCL has been implemented as MadNCL [10], using MadNLP [11] as the solver for subproblems NC\(_k\). MadNLP has the option of solving (K2) or (K1) rather than (K3).

For convex problems, system (K2) is symmetric quasidefinite (SQD) [14], and it is practical to use a sparse indefinite \(LDL^T\) factorization. MadNLP implements this option using the cuDSS library [9, 12] to utilize GPUs. Alternatively (and again for convex problems), (K1) is SPD and MadNLP can use the sparse Cholesky factorization in cuDSS (unless \(J^T J\) is dense).
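A rough CPU analogue of these two options (dense SciPy stand-ins for the sparse cuDSS factorizations; \(W = H + X^{-1}Z\), \(J\), and \(\rho_k\) are as in (K2) and (K1)):

```python
import numpy as np
from scipy.linalg import ldl, cho_factor, cho_solve

def solve_K2(W, J, rho, rhs):
    """Solve the quasidefinite (K2) system via a symmetric indefinite
    LDL^T factorization (dense stand-in for cuDSS's sparse LDL^T)."""
    m = J.shape[0]
    K2 = np.block([[-W, J.T], [J, np.eye(m) / rho]])
    L, D, _ = ldl(K2)  # K2 = L @ D @ L.T, D block diagonal
    return np.linalg.solve(L.T, np.linalg.solve(D, np.linalg.solve(L, rhs)))

def solve_K1(W, J, rho, rhs):
    """Solve the SPD (K1) system via Cholesky (dense stand-in for
    cuDSS's sparse Cholesky)."""
    return cho_solve(cho_factor(W + rho * J.T @ J), rhs)
```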

Thus, for certain large optimization problems, MadNCL is a solver that employs GPUs and is in general much faster than IPOPT or KNITRO. Numerical results are presented for security-constrained optimal power flow (SCOPF) problems solved on GPUs.

References