Solving polynomials using Gröbner bases
Algebraic geometry is a field of mathematics that studies properties of geometric objects (algebraic varieties) defined by polynomial equations, using algebraic methods. When modeling certain phenomena, you inevitably end up with differential equations or polynomial equations. Most engineering contexts lean on differential equations, but once polynomials appear, the problem falls under algebraic geometry. A typical example is the inverse kinematics problem; another, and my own entry point into algebraic geometry, is the geometry involved in computer vision. Cryptography also deals quite a bit with polynomials, and CAD systems that involve freeform surfaces eventually rely on algebraic solutions as well.
However, unless you’re in a very specialized group, most engineers are not that familiar with algebraic concepts like groups, rings, and fields. I wasn’t, either. In reading research papers, I often encountered a one-line mention like “we used this method,” but it was never clear why or how it worked. Since I didn’t even know the relevant terminology at the time, I felt quite lost and couldn’t make sense of it with my existing knowledge.
This post is written from the standpoint of an engineer who isn’t used to rigorous mathematical language, prioritizing a tool-oriented perspective over elegant definitions and full rigor. It contains only some key terms, an outline of the flow, and a few definitions. I wrote it thinking that if a post like this had been available online, it would have saved me a lot of time. The excellent reference Using Algebraic Geometry uses row spaces to describe these concepts; however, from a linear-algebra perspective, I found it more intuitive to transpose the matrices and use column spaces. That doesn’t cause any real issues for the overall argument, but if you consult the original text, keep the difference in mind.
I omitted details that aren’t critical to the main discussion, although I’ve listed a few additional keywords below.
Basic definitions
Here, I’ll start by explaining a few basic concepts. I’ll skip rings and fields; I plan to gloss over them rather loosely in what follows.
Ideal
Let’s consider a system $f(x) = 0$. Suppose $f$ is the following set of polynomials.
$$ f=(f_1,f_2,\cdots,f_n) $$Using these polynomials as generators, you can form new polynomials by multiplying each generator by an arbitrary polynomial and summing the results. The set of all such combinations is called an ideal. We denote the ideal generated by a set of polynomials by $\langle \cdot \rangle$.
$$ I = \langle f \rangle = \langle f_1,f_2,\cdots,f_n \rangle $$Also, let $k[x]$ denote the set of polynomials in the given variables with coefficients in a field $k$ (e.g., $x^3+4x \in k[x]$).
$$ I=\lbrace \sum_{i=1}^n { h_i f_i \ \big|\ h_i \in k[x] } \rbrace $$For example, consider the system $f(x,y)$.
$$ f(x,y)=\begin{cases} f_1 = x \newline f_2 = x^2 \newline f_3 = xy \newline f_4 = y^2 \end{cases} $$The following polynomials belong to the ideal $I=\langle f \rangle$:
$$ 2x,\ x^3y,\ 40x^{22}y^{21}+13xy $$The following do not:
$$ y, 4 $$Additional keywords : left ideal, right ideal, ring
Membership problem
Let us consider an ideal $I$ generated by $f$. The question then arises: is a new polynomial $p$ a member of this ideal? Naturally, if we can explicitly write $p$ as a combination of the generators $f$, it belongs to $I$. But what if no such combination is apparent? Deciding this is known as the Membership Problem, and it is by no means trivial.
Let’s look at an example:
$$ f(x,y)=\begin{cases} f_1 = xy + x \newline f_2 = xy + x + y + 1 \end{cases} $$In this example, the polynomial $p=y+1$ belongs to the ideal generated by $f$. Specifically, it can be represented as $f_2-f_1$, which is some form of linear combination of the generators. While this looks obvious here, things can get quite complicated when you have many polynomials and more complex relationships. We can express the membership test in the following way:
$$ \bar{p} = p \text{ mod } I $$When $\bar{p}=0$, we say $p \in I$. We will revisit the specifics of this remainder operation later. Such operations allow us to define a quotient ring by the ideal; the remainder in that context is called the normal form (Maple’s Normalf).
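Looking ahead a little, this test becomes effective once a Gröbner basis (introduced below) is available. Here is a minimal sketch using SymPy’s `groebner` and `GroebnerBasis.contains`, with the polynomials from the example above:

```python
from sympy import symbols, groebner

x, y = symbols('x y')
f1 = x*y + x
f2 = x*y + x + y + 1

# Membership becomes decidable via division by a Groebner basis:
# p is in the ideal iff its remainder modulo the basis is zero.
G = groebner([f1, f2], x, y, order='lex')
print(G.contains(y + 1))  # True:  y + 1 = f2 - f1
print(G.contains(y))      # False: y cannot be generated from f1, f2
```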
Monomial Ordering
In what follows, we need to define the leading term for certain algorithms, and the result will depend on how the monomials are ordered. Therefore, we need a systematic way to order monomials, and several conventions are in common use. As an example, consider polynomials in variables $x,y,z$, up to degree 3. We also need to fix an order among the individual variables; here, let’s assume $x \gt y \gt z$.
- Lexicographic (a.k.a. lex) : Monomials are ordered by comparing exponents one variable at a time, starting from $x$.
$x^3,x^2y,x^2z,x^2,xy^2,xyz,xy,xz^2,xz,x,y^3,y^2z,y^2,yz^2,yz,y,z^3,z^2,z,1$
- Degree Lexicographic (a.k.a. graded lexicographic, grlex, deglex) : We first compare the total degree of each monomial; ties are then broken in lexicographic order.
$x^3,x^2y,x^2z,xy^2,xyz,xz^2,y^3,y^2z,yz^2,z^3,x^2,xy,xz,y^2,yz,z^2,x,y,z,1$
- Degree Reverse Lexicographic (a.k.a. graded reverse lexicographic, grevlex, degrevlex) : We first compare total degrees; on a tie, the monomial with the smaller exponent in the last variable $z$ is considered greater.
$x^3,x^2y,xy^2,y^3,x^2z,xyz,y^2z,xz^2,yz^2,z^3,x^2,xy,y^2,xz,yz,z^2,x,y,z,1$
Personally, I found the difference between grlex and grevlex a bit confusing, especially grevlex, which feels like a double negative. Here are concrete examples:
- grlex : $x^2z^2 \gt xy^2z$ (the higher power of $x$ broke the tie)
- grevlex : $xy^2z \gt x^2z^2$ (the lower power of $z$ broke the tie)
Given a particular monomial ordering, the leading term of a polynomial is the first monomial according to that ordering; we often denote it by $\mathrm{LT}(\cdot)$ or a similar notation.
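For what it’s worth, SymPy exposes these orderings under the names used above ('lex', 'grlex', 'grevlex'), so the grlex/grevlex tie-breaking example can be checked directly. A small sketch:

```python
from sympy import symbols, Poly

x, y, z = symbols('x y z')
f = x**2*z**2 + x*y**2*z  # the two degree-4 monomials from the example

# The leading term depends on the chosen monomial ordering
print(Poly(f, x, y, z).LT(order='grlex'))    # x**2*z**2 (x-power breaks the tie)
print(Poly(f, x, y, z).LT(order='grevlex'))  # x*y**2*z  (z-power breaks the tie)
```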
S-polynomial
Reading the formula can be a bit cumbersome; in essence, the S-polynomial is the polynomial obtained by canceling the two polynomials’ leading terms against each other. Feel free to skip the definition below and look at the examples instead.
Formally, if we ignore coefficients and call the leading term’s monomial part the leading monomial $\mathrm{LM}(\cdot)$, and let $\mathrm{LCM}(\cdot)$ denote the least common multiple of monomials, then for two polynomials $p$ and $q$, the S-polynomial $S(\cdot, \cdot)$ is defined as:
$$ S(p,q)= {{\mathrm{LCM}(\mathrm{LM}(p),\mathrm{LM}(q))}\over{\mathrm{LT}(p)}}\cdot p - {{\mathrm{LCM}(\mathrm{LM}(p),\mathrm{LM}(q))}\over{\mathrm{LT}(q)}}\cdot q $$As an example:
$$ p=xy+x+1,\quad q=x+y^2-1 $$
- Lexicographic order $x \gt y$ : $\mathrm{LT}(p)=xy$ and $\mathrm{LT}(q)=x$, so the LCM is $xy$ and $$ S(p,q)= p - y\cdot q = x - y^3 + y + 1 $$
- Degree Lexicographic order $x \gt y$ : $\mathrm{LT}(p)=xy$ and $\mathrm{LT}(q)=y^2$, so the LCM is $xy^2$ and $$ S(p,q)= y\cdot p - x\cdot q = -x^2 + xy + x + y $$
From this example, you can see not only the basic computation steps but also how the choice of monomial ordering affects the resulting S-polynomial.
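As far as I know, SymPy does not expose a public S-polynomial function, so here is a small hand-rolled sketch built on `Poly.LT` and `lcm`; it reproduces the two results above (up to an overall scalar when the leading coefficients are not 1):

```python
from sympy import symbols, Poly, lcm, expand

def s_poly(p, q, gens, order):
    """S-polynomial of p and q under the given monomial ordering."""
    P, Q = Poly(p, *gens), Poly(q, *gens)
    ltp, ltq = P.LT(order=order), Q.LT(order=order)
    m = lcm(ltp, ltq)  # least common multiple of the leading terms
    return expand(m / ltp * p - m / ltq * q)

x, y = symbols('x y')
p = x*y + x + 1
q = x + y**2 - 1
print(s_poly(p, q, (x, y), 'lex'))    # x - y**3 + y + 1
print(s_poly(p, q, (x, y), 'grlex'))  # -x**2 + x*y + x + y
```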
Buchberger’s algorithm
Buchberger’s Algorithm is an algorithm for computing Gröbner bases. (Gröbner bases were introduced by Buchberger, who named them after his doctoral advisor, Gröbner.) We’ll discuss the formal definition of a Gröbner basis a bit later; for now, let’s just note that it is a basis of a particular ideal. It may feel more intuitive to look at the method first and then see the definition.
Buchberger’s method works as follows:
Input : $f = \lbrace f_1,f_2,\cdots , f_n\rbrace$, $I = \langle f \rangle$
Output : Gröbner basis $g = \lbrace g_1,g_2,\cdots , g_m\rbrace$
1. $g := f$
2. For each pair of polynomials $p,q$ in $g$, compute the S-polynomial $S$.
3. $r = S \text{ mod } g$
4. If $r=0$, do nothing; if $r\neq 0$, add $r$ to $g$.
5. Repeat steps 2 through 4 until no new polynomials are added.
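Below is a naive sketch of this loop in SymPy; `reduced` performs the multivariate division that produces the remainder $r$. Real implementations (Buchberger with selection criteria, F4/F5) prune the pairs aggressively; this version just returns some basis that generates the same ideal, not a reduced one:

```python
from itertools import combinations
from sympy import symbols, Poly, lcm, expand, reduced

def s_poly(p, q, gens, order):
    """S-polynomial of p and q (up to a scalar), as defined earlier."""
    P, Q = Poly(p, *gens), Poly(q, *gens)
    m = lcm(P.LT(order=order), Q.LT(order=order))
    return expand(m / P.LT(order=order) * p - m / Q.LT(order=order) * q)

def buchberger(F, gens, order='lex'):
    """Naive Buchberger loop: add nonzero remainders until none appear."""
    G = list(F)
    while True:
        new = []
        for p, q in combinations(G, 2):
            _, r = reduced(s_poly(p, q, gens, order), G, *gens, order=order)
            if r != 0 and r not in new:
                new.append(r)
        if not new:
            return G  # every S-polynomial now reduces to zero
        G.extend(new)

x, y = symbols('x y')
# Contains f1, f2 plus the discovered remainders (redundant, not reduced)
print(buchberger([x**2 + y**3 - 1, x*y - x + y**2], (x, y)))
```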
Let’s look at an example, using lexicographic (lex) order. The polynomials are taken from this video. The visualization there is relatively neat, and the elimination process occurs several times at a suitable level of complexity.
$$ f_1 = x^2+y^3-1 \newline f_2 = xy-x+y^2 $$step1.
$$ g=\lbrace g_1,g_2\rbrace:=\lbrace f_1,f_2\rbrace $$step2.
$$ \begin{align*} S_{12} &= yg_1 - xg_2 \newline &= x^2y+y^4-y - x^2y + x^2 - xy^2 \newline &= x^2 - xy^2 + y^4 -y \end{align*} $$step3.
$$ \begin{align*} S_{12} & = x^2 - xy^2 + y^4 -y \newline & = g_1 - (y+1)g_2 + (-x+y^4+y^2-y+1) \newline \end{align*} $$In the course of computation, we find that the remainder is $(-x+y^4+y^2-y+1)$, which is nonzero. Therefore, we add it to the basis.
$$ S_{12} \text{ mod }g = (-x+y^4+y^2-y+1) \ne 0 \newline g_3 := -x+y^4+y^2-y+1 \newline g = \lbrace g_1,g_2,g_3\rbrace $$We return to step 2.
$$ \begin{align*} S_{13} &= g_1 + xg_3 \newline &= x^2 + y^3 -1 - x^2 + xy^4 +xy^2 - xy+x\newline &= xy^4 + xy^2 - xy + x + y^3 -1 \end{align*} $$step3.
$$ \begin{align*} S_{13} & = xy^4 + xy^2 - xy + x + y^3 -1 \newline & = (y^3+y^2+2y+1)g_2 - 2g_3 + (-y^5+y^4-y^3+y^2-2y+1) \newline \end{align*} $$Likewise, we continue the same procedure.
$$ S_{13} \text{ mod }g = (-y^5+y^4-y^3+y^2-2y+1) \ne 0 \newline g_4 := -y^5+y^4-y^3+y^2-2y+1 \newline g = \lbrace g_1,g_2,g_3,g_4\rbrace $$Returning to step 2 once more.
$$ \begin{align*} S_{23} &= g_2 + yg_3 \newline &= xy-x+y^2 - xy+y^5+y^3-y^2+y\newline &= -x + y^5 + y^3 + y \end{align*} $$step3.
$$ \begin{align*} S_{23} & = -x + y^5 + y^3 + y \newline & = g_3 - g_4 \newline \end{align*} $$This time the division goes through cleanly, so no new basis element is added. We skip the details for $S_{14},S_{24},S_{34}$, etc., simply noting that they yield no new basis elements. Consequently, the Gröbner basis for this ideal $I$ is as follows. We write it as a set of monic polynomials (i.e., the leading coefficient of each polynomial is 1), which makes no substantive difference.
$$ g=\begin{cases} g_1 = x^2+y^3-1 \newline g_2 = xy-x+y^2 \newline g_3 = x-y^4-y^2+y-1 \newline g_4 = y^5-y^4+y^3-y^2+2y-1 \end{cases} $$It seems we have now covered the main background definitions.
I’ve attached a YouTube video below in the hope that it will help clarify the ideas above.
Additional keywords : Dickson’s lemma, Hilbert Basis Theorem
Gröbner basis
Definition
The definition of a Gröbner basis is as follows. (Some parts have been omitted, so I do wonder whether this is still a proper “definition.”)
Let $g = \lbrace g_1,\cdots,g_n\rbrace$. Then $g$ is a Gröbner basis for the ideal $I$ if
$\langle \mathrm{LT}(I) \rangle = \langle \mathrm{LT}(g_1),\cdots, \mathrm{LT}(g_n)\rangle$
where $\mathrm{LT}(I)$ denotes the set of leading terms of all polynomials in the ideal, and $\mathrm{LT}(g_i)$ denotes the leading term of $g_i$.
In other words, once you have a Gröbner basis for an ideal, you can represent the entire ideal. As we saw in the Buchberger algorithm example, having the right leading terms is essential for describing the ideal effectively.
Note that the basis we obtain this way (which still contains the original polynomials $f$) is not necessarily an “optimal” Gröbner basis. Ideally, no polynomial’s leading term should be divisible by that of another polynomial in the basis, but nothing in the construction enforced that condition. This motivates two refined notions:
- Minimal Gröbner basis : A Gröbner basis in which no polynomial’s leading term is divisible by another polynomial’s leading term in the set.
- Reduced Gröbner basis : A minimal Gröbner basis in which every polynomial is monic.
From that perspective, the reduced Gröbner basis in our earlier example is as follows.
$$ g=\begin{cases} g_1 = x-y^4-y^2+y-1 \newline g_2 = y^5-y^4+y^3-y^2+2y-1 \newline \end{cases} $$Ultimately, these polynomials are independent of one another, fully define the ideal, and exhibit its core properties.
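This is also what SymPy’s `groebner` returns, since it computes the reduced Gröbner basis by default:

```python
from sympy import symbols, groebner

x, y = symbols('x y')
G = groebner([x**2 + y**3 - 1, x*y - x + y**2], x, y, order='lex')
print(G.exprs)
# [x - y**4 - y**2 + y - 1, y**5 - y**4 + y**3 - y**2 + 2*y - 1]
```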
Additional keywords : F4, F5, FGLM
Basic solving
Originally, the polynomial system $f$ was as follows:
$$ f=\begin{cases} f_1 = x^2+y^3-1 \newline f_2 = xy-x+y^2 \end{cases} $$However, when we computed its Gröbner basis, the system was reduced to the following two equations:
$$ g=\begin{cases} g_1 = x-y^4-y^2+y-1 \newline g_2 = y^5-y^4+y^3-y^2+2y-1 \end{cases} $$
A Gröbner basis provides a way to view the structure of an ideal with greatly reduced complexity. In the above system, solving for $y$ becomes straightforward because we now have a single-variable polynomial in $y$, and we can apply any suitable numerical method. Once $y$ is found, solving for $x$ is trivial.
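Concretely, here is a sketch of this root-finding and back-substitution with SymPy and NumPy (the helper names are mine):

```python
from sympy import symbols, groebner, Poly, solve
import numpy as np

x, y = symbols('x y')
G = groebner([x**2 + y**3 - 1, x*y - x + y**2], x, y, order='lex')

# The lex basis is triangular here: one polynomial is univariate in y
g_y = next(g for g in G.exprs if x not in g.free_symbols)
g_x = next(g for g in G.exprs if x in g.free_symbols)
x_of_y = solve(g_x, x)[0]  # g_x is linear in x for this example

# Numerical roots of the univariate polynomial, then back-substitution
for y0 in np.roots([float(c) for c in Poly(g_y, y).all_coeffs()]):
    y0 = complex(y0)
    print(complex(x_of_y.subs(y, y0)), y0)
```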
Since the monomial order determines the leading term, we can think of it as an “elimination order.” In many cases, the chosen order ensures that the last variable in the ordering appears in a univariate polynomial. However, it may happen that the ideal does not contain a univariate polynomial in that variable. In such cases, switching to a different elimination order can sometimes produce the desired univariate polynomial; if not, there are other methods, such as other forms of elimination. We will not go into additional techniques here.
Quotient ring & Normal form
Consider the polynomial ring $k[x]$. By using the division operation together with an ideal $I$, we obtain a quotient ring. In this context, the Krull dimension of the quotient ring tells us the dimension in which solutions may lie; if this Krull dimension is 0, the solution set is finite. We will not go into further details here.
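SymPy exposes this zero-dimensionality test as a property of its Gröbner basis object; for the running example:

```python
from sympy import symbols, groebner

x, y = symbols('x y')
G = groebner([x**2 + y**3 - 1, x*y - x + y**2], x, y, order='lex')
print(G.is_zero_dimensional)  # True: the system has finitely many solutions
```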
The quotient ring is denoted as
$$ k[x]/I $$The quotient ring is indeed a ring: the cosets of the polynomial ring $k[x]$ “divided” by the given ideal $I$ inherit the ring operations. For any polynomial $f$, its coset is defined by
$$ [f] = f+I $$Cosets have the following important property:
$$ [f]=[g] \iff f-g \in I $$Thus, the remainder of a polynomial upon division by the ideal $I$ corresponds one-to-one with a coset. Let’s look at a somewhat concrete example (the Gröbner basis here is given in deglex order and is reduced):
$$ g=\begin{cases} g_1 = x^2+xy+y^2-2x-3y \newline g_2 = xy^2-x \newline g_3 = y^3-y \end{cases} $$Here, $\mathrm{LT}(g)$ is $\lbrace x^2,xy^2,y^3 \rbrace$. Notice that monomials such as $xy,y^2,x,y,1$ are not divisible by any of these leading terms; we call them “standard monomials.” Every coset can be uniquely represented in terms of the standard monomials, and this representative is called the normal form.
From a vector space point of view, the quotient ring consists of these normal forms, and there is a monomial basis for these normal forms. For instance, in the quotient ring defined by the Gröbner basis above, $x^2$ can be represented by the normal form $-xy-y^2+2x+3y$. The same logic applies to any polynomial: every polynomial maps (surjects) to exactly one coset in the quotient ring, and any polynomial that can be written as $I+(-xy-y^2+2x+3y)$ lies in the same coset.
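In SymPy, the normal form is the remainder returned by `GroebnerBasis.reduce`; a quick check of the $x^2$ example:

```python
from sympy import symbols, groebner

x, y = symbols('x y')
G = groebner([x**2 + x*y + y**2 - 2*x - 3*y, x*y**2 - x, y**3 - y],
             x, y, order='grlex')  # SymPy's grlex is the deglex order used here

_, r = G.reduce(x**2)  # quotients and the remainder (the normal form)
print(r)               # -x*y - y**2 + 2*x + 3*y
```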
Action Matrix
We can now define an action matrix with respect to the standard monomials. An action matrix is a matrix whose action on the vector of standard monomials has, modulo the ideal, the same effect as multiplying by a particular variable.
For example:
$$ x \begin{bmatrix} xy \newline y^2 \newline x \newline y \newline 1 \end{bmatrix}=M_x \begin{bmatrix} xy \newline y^2 \newline x \newline y \newline 1 \end{bmatrix} $$Let’s work it out explicitly:
$$ \begin{align*} x\cdot xy &= yg_1-g_2-g_3+(2xy+3y^2-x-y)=2xy+3y^2-x-y \newline x\cdot y^2 &= g_2+x = x \newline x\cdot x &= g_1 - (xy+y^2-2x-3y) = -xy - y^2 +2x + 3y \newline x\cdot y &= xy\newline x\cdot 1 &= x \end{align*} $$Hence, the action matrix $M_x$ can be written as:
$$ M_x = \begin{bmatrix} 2 & 3 & -1 & -1 & 0 \newline 0 & 0 & 1 & 0 & 0 \newline -1 & -1 & 2 & 3 & 0 \newline 1 & 0 & 0 & 0 & 0 \newline 0 & 0 & 1 & 0 & 0 \end{bmatrix} $$Here’s an important insight:
If multiplying by the matrix $M_x$ is equivalent to multiplying by $x$ (modulo the ideal), then the eigenvalues of $M_x$ are the values of $x$ at the solutions.
In other words, once we construct an action matrix, its eigenvalues are the values of that variable at the solutions of the system: evaluating the vector of standard monomials at a solution gives an eigenvector of $M_x$, with that solution’s $x$-value as its eigenvalue.
We have now come a long way, so let’s recap the big picture:
Given a set of polynomials, we use their Gröbner basis to represent the ideal they generate. We express the quotient ring using the normal forms determined by the standard monomials that are not in the leading terms of the Gröbner basis. Then, by considering the structure of this normal form, we can construct an action matrix that corresponds to multiplication by a given variable. Solving for the eigenvalues of that matrix effectively solves the polynomial system.
In the example above, the real eigenvalues turn out to be $0,-1,2$. Solving for $y$ is then straightforward: the eigenvalues $x=0,-1,2$ correspond to $y=0,1,1$ respectively. Substituting these back in shows that they satisfy the original system. (In fact, because the polynomial $y^3-y$ is in the basis, the candidate $y$ values were already somewhat obvious.)
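To close the loop, here is a sketch that builds $M_x$ from the normal forms with SymPy and extracts its eigenvalues with NumPy (the helper names are mine):

```python
from sympy import symbols, groebner, Poly
import numpy as np

x, y = symbols('x y')
G = groebner([x**2 + x*y + y**2 - 2*x - 3*y, x*y**2 - x, y**3 - y],
             x, y, order='grlex')

std = [x*y, y**2, x, y, 1]  # standard monomials of the quotient ring

# Row i of M_x holds the normal-form coefficients of x * std[i]
M = np.zeros((len(std), len(std)))
for i, b in enumerate(std):
    _, r = G.reduce(x * b)   # normal form of x times a standard monomial
    pr = Poly(r, x, y)
    for j, m in enumerate(std):
        M[i, j] = float(pr.coeff_monomial(m))

print(np.linalg.eigvals(M))  # real eigenvalues 0, -1, 2, plus a complex pair
```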