
Towards Efficient Algorithms for Approximately Solving (Overdetermined) Systems of Polynomial Equations

N. Govindarajan, R. Widdershoven, L. De Lathauwer

Abstract

We revisit the age-old problem of solving a system of multivariate polynomial equations. This problem can be viewed as a simultaneous generalization of solving linear systems and finding roots of univariate polynomials. It is well-known that the aforementioned special cases have widespread applications in science and engineering. It should come as no surprise that the same is true of the more general version of this problem. This is particularly the case in the noisy overdetermined setting, which extends the familiar engineering notion of solving a system of linear equations in a “least-squares” fashion to the polynomial case.

Classically, solving a system of polynomial equations belongs to the field of computational algebraic geometry. The literature advocates two major approaches to the problem. The first approach involves homotopy continuation, where the roots of the desired system are found by continuous deformation of a starting system for which the roots are already known. The second approach, which is more in line with our work, reduces the problem to an eigenvalue problem. The algebraic approach has its origins in resultant theory, tracing back to original contributions by Bézout, Sylvester, Cayley, and Macaulay. The main idea behind this line of attack is to unveil the structure of the quotient algebra of the ideal generated by the polynomial system. Solutions of the system are subsequently extracted from the eigen-structure of the generated multiplication tables [1].

Up to this point, the literature has primarily focused on the noiseless square case. Herein, it is assumed that the coefficients of the polynomials are known with full precision and the number of equations equals the number of unknowns. In contrast, a critical component of engineering applications is the estimation of system parameters from an overcomplete set of equations corrupted by noise. In the case of linear systems, numerical linear algebra already provides effective methods to deal with such problems. Analogous methods for the more general polynomial case are, however, far fewer and relatively underdeveloped.

Our work hopes to fill this gap by taking a fresh perspective on the problem. The cornerstone of our proposed framework is the (tensor-based) Macaulay method [4]. This method rests on the idea that a basis for the quotient algebra can be formed by computing a numerical null space of a resultant map. The classical Macaulay matrix, which generalizes the Sylvester resultant matrix of two univariate polynomials to the multivariate case, is a canonical example of such a map. The fact that the null space is obtained through numerical means is an essential ingredient in enabling the solvability of polynomial systems in an approximate sense [5].

This key feature of the methodology is best explained through an example. Suppose that one wants to solve the overdetermined linear system

\[\left [ \begin {array}{c|cc} -3 & -1 & -2 \\ -2 & -1 & 1 \\ 1 & 7 & 1 \end {array}\right ] \left [ \begin {array}{c} 1 \\ \hline x \\ y \end {array} \right ] = \left [ \begin {array}{c} 0 \\ 0 \\ 0 \end {array} \right ]\]

in a total least-squares sense. As is well known, the solution is trivially found by retrieving the smallest right-singular vector of the matrix on the left-hand side of the expression. Since the corresponding singular value is strictly positive, the right-singular vector (after normalization) will only yield an approximate solution of the system. Interestingly, a similar strategy may be employed to find approximate solutions for polynomial systems. For example, the overdetermined polynomial system
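The computation just described can be sketched in a few lines; NumPy is our assumed tool here, not part of the original text:

```python
import numpy as np

# Coefficient matrix of the overdetermined linear system, with columns
# ordered as [1 | x  y], matching the display above.
A = np.array([[-3., -1., -2.],
              [-2., -1.,  1.],
              [ 1.,  7.,  1.]])

# The total least-squares solution is read off from the right-singular
# vector associated with the smallest singular value.
_, s, Vt = np.linalg.svd(A)
v = Vt[-1] / Vt[-1][0]   # normalize so the constant entry equals 1
x, y = v[1], v[2]        # approximate solution of the system

# The smallest singular value is strictly positive, so (x, y) solves
# the system only approximately.
print(s[-1], (x, y))
```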

\begin{equation} \begin{array}{c} \scriptstyle p_1(x,y)\\ \scriptstyle p_2(x,y)\\ \scriptstyle p_3(x,y) \end {array} \left [ \begin{array}{c|cc|ccc} -3 & -1 & -2 & 4 & 6 & 7 \\ -2 & -1 & 1 & 3 & -7 & 5 \\ 1 & 7 & 1 & -8 & 3 & 1 \end {array} \right ] \left [ \begin{array}{c} 1 \\ \hline x \\ y \\\hline x^2 \\ xy \\ y^2 \end {array} \right ] = \left [ \begin{array}{c} 0 \\ 0 \\ 0 \end {array} \right ] \label {eq:poly} \end{equation}

has an exact solution at \((x,y) = (1,0)\), which is also the only solution of the system. This single solution disappears under small perturbations of the coefficients, but one can still view \((x,y) = (1,0)\) as an approximate solution of the noise-perturbed system. After all, the Vandermonde vector evaluated at this point, i.e., \(\begin {bmatrix} 1 & 1 & 0 & 1 & 0 & 0 \end {bmatrix}^{\top }\), is still an approximate null vector for the noise-perturbed matrix. It can subsequently be used to derive approximate solutions.
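This observation is easy to verify numerically. The sketch below (the perturbation size and random seed are our own illustrative choices) checks that the Vandermonde vector at \((1,0)\) annihilates the exact coefficient matrix and nearly annihilates a perturbed one:

```python
import numpy as np

# Coefficient matrix of the polynomial system, columns ordered as
# [1 | x  y | x^2  xy  y^2].
M = np.array([[-3., -1., -2.,  4.,  6., 7.],
              [-2., -1.,  1.,  3., -7., 5.],
              [ 1.,  7.,  1., -8.,  3., 1.]])

def vandermonde(x, y):
    """Monomials [1, x, y, x^2, x*y, y^2] evaluated at (x, y)."""
    return np.array([1., x, y, x * x, x * y, y * y])

v = vandermonde(1.0, 0.0)
print(np.linalg.norm(M @ v))        # zero: (1, 0) solves the exact system

# A small perturbation of the coefficients destroys the exact solution,
# yet v remains an approximate null vector.
rng = np.random.default_rng(0)
M_noisy = M + 1e-6 * rng.standard_normal(M.shape)
print(np.linalg.norm(M_noisy @ v))  # small but nonzero residual
```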

Unlike in the linear case, there are several complications that one has to treat in the polynomial case. Firstly, the existence of other, artificial null vectors in (1) makes the retrieval of the Vandermonde solution vector impossible. This first complication is resolved by adding additional equations to the system:

\begin{equation*} \begin{array}{c} \scriptstyle p_1(x,y)\\ \scriptstyle p_2(x,y)\\ \scriptstyle p_3(x,y)\\ \scriptstyle xp_1(x,y)\\ \scriptstyle xp_2(x,y)\\ \scriptstyle xp_3(x,y)\\ \scriptstyle yp_1(x,y)\\ \scriptstyle yp_2(x,y)\\ \scriptstyle yp_3(x,y)\\ \end {array} \left [ \begin{array}{c|cc|ccc|cccc} -3 & -1 & -2 & 4 & 6 & 7 & 0 & 0 & 0 & 0 \\ -2 & -1 & 1 & 3 & -7 & 5 & 0 & 0 & 0 & 0\\ 1 & 7 & 1 & -8 & 3 & 1 & 0 & 0 & 0 & 0 \\ \hline 0 & -3 & 0 & -1 & -2 & 0 & 4 & 6 & 7 & 0 \\ 0 & -2 & 0 & -1 & 1 & 0 & 3 & -7 & 5 & 0 \\ 0 & 1 & 0 & 7 & 1 & 0 & -8 & 3 & 1 & 0 \\ \hline 0 & 0 & -3 & 0 & -1 & -2 & 0 & 4 & 6 & 7\\ 0 & 0 & -2 & 0 & -1 & 1 & 0 & 3 & -7 & 5\\ 0 & 0 & 1 & 0 & 7 & 1 & 0 & -8 & 3 & 1\\ \end {array} \right ] \left [ \begin{array}{c} 1 \\ \hline x \\ y \\\hline x^2 \\ xy \\ y^2 \\\hline x^3 \\ x^2y \\ xy^2 \\ y^3 \end {array} \right ] = \left [ \begin{array}{c} 0 \\ 0 \\ 0 \\\hline 0 \\ 0 \\ 0 \\\hline 0 \\ 0 \\ 0 \end {array} \right ]. \end{equation*}

The above matrix is an example of a Macaulay matrix at a certain degree. This Macaulay matrix contains the Vandermonde vector

\[\left [\begin {array}{c|cc|ccc|cccc} 1 & 1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 \end {array}\right ]^{\top }\]

as the only null vector up to scaling ambiguity. Secondly, a polynomial system can, in general, admit multiple solutions. Consequently, the numerically obtained (approximate) null basis will not be immediately in Vandermonde form. This second complication is resolved by performing an additional unmixing operation to retrieve the Vandermonde basis. This Vandermonde basis reconstruction can be viewed as a multi-dimensional harmonic retrieval problem [3], and can be effectively solved by computing a canonical polyadic decomposition of a tensor formed from the obtained null space basis [4].
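The degree-3 Macaulay matrix displayed above can also be assembled programmatically: multiplying a polynomial by \(x\) or \(y\) simply shifts its coefficient row into higher-degree monomial columns. A minimal NumPy sketch (the index maps `shift_x` and `shift_y` are our own bookkeeping, derived from the monomial ordering used in the display):

```python
import numpy as np

# Degree-2 coefficient rows in the basis [1, x, y, x^2, xy, y^2].
P = np.array([[-3., -1., -2.,  4.,  6., 7.],
              [-2., -1.,  1.,  3., -7., 5.],
              [ 1.,  7.,  1., -8.,  3., 1.]])

# Degree-3 monomial basis: [1, x, y, x^2, xy, y^2, x^3, x^2y, xy^2, y^3].
# Column where each degree-2 monomial lands after multiplying by x or y:
shift_x = [1, 3, 4, 6, 7, 8]
shift_y = [2, 4, 5, 7, 8, 9]

rows = []
for p in P:                    # the original equations p_s
    r = np.zeros(10); r[:6] = p; rows.append(r)
for p in P:                    # the shifted equations x * p_s
    r = np.zeros(10); r[shift_x] = p; rows.append(r)
for p in P:                    # the shifted equations y * p_s
    r = np.zeros(10); r[shift_y] = p; rows.append(r)
Mac = np.array(rows)

# The null space is spanned by the Vandermonde vector of (x, y) = (1, 0).
vand = np.array([1., 1., 0., 1., 0., 0., 1., 0., 0., 0.])
print(np.linalg.norm(Mac @ vand))
```

The resulting \(9 \times 10\) matrix matches the display above, and its one-dimensional null space is spanned by the Vandermonde vector of \((1,0)\).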

In general, solutions for a polynomial system can be determined in two computational steps: (i) compute the null space of the Macaulay matrix, and (ii) compute a polyadic decomposition of a tensor to retrieve the solutions. In the case of overdetermined systems that possess only a handful of solutions, the first step is a major computational bottleneck. Macaulay matrices grow very rapidly in size for even moderately sized polynomial systems. This makes the null space computation prohibitively expensive. To get an impression of the complexity, a polynomial in \(N\) variables of degree \(D\) contains \(\binom {N+D}{N}\) monomial terms. The Macaulay matrix at degree \(D_{\mathrm {mac}} \geq D\) of \(S\) such polynomials is the matrix that is constructed from the polynomial coefficients of the system \(\lbrace p_s\rbrace ^{S}_{s=1}\) such that its rows span the set of polynomials

\begin{equation} \left \lbrace p:= \sum ^S_{s=1} h_s \cdot p_s: \quad \operatorname {deg}(p) \leq D_{\mathrm {mac}} \right \rbrace . \end{equation}

Both the number of rows and the number of columns of this matrix grow combinatorially with respect to \(D_{\mathrm {mac}}\), \(D\), and \(N\).
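For concreteness, these dimensions can be tabulated with a short helper. The function name and the closed-form row count \(S \binom{N + D_{\mathrm{mac}} - D}{N}\) are our own bookkeeping, assuming all \(S\) polynomials have degree exactly \(D\):

```python
from math import comb

def macaulay_shape(N, D, S, D_mac):
    """Rows and columns of the Macaulay matrix of S degree-D polynomials
    in N variables, enlarged up to degree D_mac >= D."""
    cols = comb(N + D_mac, N)          # monomials of degree <= D_mac
    rows = S * comb(N + D_mac - D, N)  # monomial shifts of degree <= D_mac - D
    return rows, cols

# Running example: S = 3 quadratics (D = 2) in N = 2 variables at
# D_mac = 3 gives the 9 x 10 matrix shown earlier.
print(macaulay_shape(2, 2, 3, 3))    # (9, 10)

# Growth is combinatorial: already modest parameters give huge matrices.
print(macaulay_shape(6, 4, 10, 12))  # (30030, 18564)
```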

Fortunately, the Macaulay matrix is highly structured. In the monomial basis, it possesses a multilevel Toeplitz structure. It also has recursive properties, and in certain cases it can be highly sparse. Furthermore, the Chebyshev variant of the Macaulay matrix is multilevel Toeplitz-plus-Hankel. Consequently, much of our research has been dedicated to developing efficient null-space computation algorithms that exploit these structures. Our efforts in this area have led to some interesting work, some of which is still ongoing research.

At the Householder Symposium, our goal is to share this research with our colleagues.

References