Affine normalization of quadrics

This is an explanation of the function ddg.math.symmetric_matrices.affine_normalization(). To classify quadrics up to projective transformation, it suffices to compute their signatures (up to global sign). Classifying quadrics up to affine transformation is less commonly taught, so we explain it here. It is useful for applications such as explicitly parametrizing quadrics.

The problem

We are given a symmetric matrix \(Q \in \R^{(n+1) \times (n+1)}\) defining a quadric in \(\RP^n\) and want to bring it to some kind of normal form. We will see that it is always possible to bring it to either a diagonal form with entries 1, -1 or 0 on the diagonal or to the form

\[\begin{split}\begin{pmatrix} D & & \\ & 0 & 1\\ & 1 & 0\\ \end{pmatrix}\end{split}\]

where \(D\) is again a diagonal matrix with entries 1, -1 or 0. By also keeping track of the global sign and of the order of the diagonal entries, one can truly identify quadrics up to affine transformation with affine signatures and with matrices in normal form, but we will not deal with this here. See Signature and AffineSignature for more information on this.
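
As a small illustration (an example of ours, not taken from the function's documentation), consider conics in the plane, \(n = 2\), with points written as \(u = (x, y, 1)^\transp\). The unit circle \(x^2 + y^2 - 1 = 0\) is already in the diagonal normal form, and the parabola \(x^2 + 2y = 0\) is in the second form with \(D = (1)\). Their matrices are, respectively,

\[\begin{split}\begin{pmatrix} 1 & & \\ & 1 & \\ & & -1 \end{pmatrix} \qquad \text{and} \qquad \begin{pmatrix} 1 & & \\ & 0 & 1 \\ & 1 & 0 \end{pmatrix}.\end{split}\]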

Preliminaries

We will decompose \(Q\) as

\[\begin{split}\begin{pmatrix} Q_1 & q \\ q^\transp & r \end{pmatrix}\end{split}\]

with \(Q_1 \in \R^{n \times n}\) symmetric, \(q \in \R^n\) and \(r \in \R\). Our goal is to find an affine transformation in homogeneous coordinates that brings it to normal form. This amounts to finding a matrix \(A \in \R^{(n+1) \times (n+1)}\) of the form

\[\begin{split}A = \begin{pmatrix} A_1 & b \\ 0 & 1 \end{pmatrix}\end{split}\]

with \(A_1 \in \operatorname{GL}(n)\) and \(b \in \R^n\) such that

\[\begin{split}A^\transp Q A = \begin{pmatrix} A_1{}^\transp Q_1 A_1 & A_1{}^\transp (Q_1b + q) \\ (Q_1 b + q)^\transp A_1 & \langle b, Q_1 b + 2q\rangle + r \end{pmatrix}\end{split}\]

is in normal form.
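
The block formula for \(A^\transp Q A\) can be checked numerically. The following is a minimal NumPy sketch on random test data; all variable names are ours and purely illustrative, and nothing here is part of ddg.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3

# Random symmetric Q, split into the blocks Q1, q and r.
M = rng.standard_normal((n + 1, n + 1))
Q = M + M.T
Q1, q, r = Q[:n, :n], Q[:n, n], Q[n, n]

# Random matrix of the affine form [[A1, b], [0, 1]].
A1 = rng.standard_normal((n, n))
b = rng.standard_normal(n)
A = np.eye(n + 1)
A[:n, :n] = A1
A[:n, n] = b

# Assemble the right-hand side of the block formula entry by entry.
expected = np.empty((n + 1, n + 1))
expected[:n, :n] = A1.T @ Q1 @ A1
expected[:n, n] = A1.T @ (Q1 @ b + q)
expected[n, :n] = (Q1 @ b + q) @ A1
expected[n, n] = b @ (Q1 @ b + 2 * q) + r

block_formula_holds = np.allclose(A.T @ Q @ A, expected)
```

The identity itself does not need \(A_1\) to be invertible; invertibility only matters for \(A\) to be a genuine affine transformation.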

The non-parabolic case

If we can find \(b\) so that

\[A_1{}^\transp(Q_1 b + q) = 0 \iff Q_1 b + q = 0\]

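(since \(A_1\) is invertible), we are in the non-parabolic case. Here, we can just diagonalize \(Q_1\), i.e. choose \(A_1\) orthogonal such that \(D_1 \coloneqq A_1{}^\transp Q_1 A_1\) is diagonal, and are already almost done. Since \(Q_1 b = -q\), the lower right entry \(\langle b, Q_1 b + 2q \rangle + r\) simplifies to \(\langle b, q \rangle + r\), so we now have a diagonal matrix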

\[\begin{split}D \coloneqq \begin{pmatrix} D_1 & 0 \\ 0 & \langle b, q \rangle + r \end{pmatrix}\end{split}\]

with diagonal entries \(d_1,\dots,d_{n+1} \in \R\), which we still need to normalize to 1, -1 or 0. To do this, we can just transform with a diagonal matrix \(C = (c_{ij}) \in \R^{(n+1) \times (n+1)}\) with diagonal entries

\[\begin{split}c_{ii} \coloneqq \begin{cases} \frac{1}{\sqrt{|d_i|}}, & \text{if \(d_i \ne 0\),} \\ 1 & \text{else}. \end{cases}\end{split}\]

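Note that \(C\) also represents an affine transformation: its last row is a nonzero multiple of \((0, \dots, 0, 1)\), and matrices in homogeneous coordinates are only determined up to a nonzero scalar factor.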
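
The non-parabolic case can be sketched in a few lines of NumPy. This is an illustrative sketch under our own assumptions, not the implementation of affine_normalization(): the helper name normalize_non_parabolic, the tolerance tol and the example ellipse are all hypothetical.

```python
import numpy as np

def normalize_non_parabolic(Q, tol=1e-9):
    """Hypothetical helper: return (A, N) with N = A.T @ Q @ A diagonal
    with entries in {1, -1, 0}. Assumes Q1 b + q = 0 is solvable."""
    n = Q.shape[0] - 1
    Q1, q = Q[:n, :n], Q[:n, n]

    # Solve Q1 b = -q; lstsq returns an exact solution whenever one exists.
    b = np.linalg.lstsq(Q1, -q, rcond=None)[0]
    if not np.allclose(Q1 @ b + q, 0.0, atol=tol):
        raise ValueError("Q1 b + q = 0 is not solvable (parabolic case)")

    # Orthogonal diagonalization of the symmetric block Q1.
    _, O1 = np.linalg.eigh(Q1)
    A = np.eye(n + 1)
    A[:n, :n] = O1
    A[:n, n] = b

    # Rescale every nonzero diagonal entry d_i by 1 / sqrt(|d_i|).
    d = np.diag(A.T @ Q @ A)
    c = np.ones(n + 1)
    nonzero = np.abs(d) > tol
    c[nonzero] = 1.0 / np.sqrt(np.abs(d[nonzero]))
    A = A @ np.diag(c)
    return A, A.T @ Q @ A

# Example (ours): the ellipse (x - 1)^2 + 4 (y + 2)^2 = 4.
Q = np.array([[1.0, 0.0, -1.0],
              [0.0, 4.0, 8.0],
              [-1.0, 8.0, 13.0]])
A, N = normalize_non_parabolic(Q)
```

In this example the translation part \(b = (1, -2)\) found by the solver is the center of the ellipse, and the resulting matrix is \(\operatorname{diag}(1, 1, -1)\).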

The parabolic case

If \(Q_1 b + q = 0\) is not solvable, we are in the parabolic case and we would like to get \(A_1{}^\transp(Q_1 b + q) = e_n\) (the nth standard basis vector) and \(\langle b, Q_1 b + 2q \rangle + r = 0\) to get to normal form.

Step 1: Reducing \(q\)

We want \(A_1{}^\transp(Q_1 b + q) = e_n\). Let \(k \coloneqq \dim\ker Q_1 > 0\) (otherwise \(Q_1\) would be invertible and we would be in the non-parabolic case).

We start as in the non-parabolic case by diagonalizing \(Q_1\) with an orthogonal matrix \(O_1\). We choose it so that the eigenvalues on the diagonal of \(O_1{}^\transp Q_1 O_1\) are sorted such that the 0s are at the end. We get that \(O_1{}^\transp\) maps \(\ker Q_1\) to \(\{0\} \times \R^k \subset \R^n\).

We can’t solve \(Q_1 b + q = 0\) in this case, but we choose \(b \coloneqq Q_1{}^+(-q)\), where \(Q_1{}^+\) is the Moore-Penrose pseudoinverse (i.e. \(b\) is the minimum-norm least-squares solution of the equation). Using the symmetry of \(Q_1\) and the general pseudoinverse identity \(Q_1{}^\transp(I_n - Q_1Q_1{}^+) = 0\), we see that

\[\begin{split}\begin{aligned} Q_1(Q_1 b + q) &= Q_1(Q_1(-Q_1{}^+)q + q) \\ &= Q_1(I_n - Q_1Q_1{}^+)q \\ &= Q_1{}^\transp(I_n - Q_1Q_1{}^+)q \\ &= 0, \end{aligned}\end{split}\]

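i.e. \(Q_1 b + q \in \ker Q_1\). Together with the fact that \(O_1{}^\transp\) maps \(\ker Q_1\) to \(\{0\} \times \R^k\), this implies that there exists a vector \(\bar{q} \in \R^k\), nonzero because \(Q_1 b + q = 0\) is not solvable, such that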

\[\begin{split}O_1{}^\transp (Q_1 b + q) = \begin{pmatrix} 0 \\ \bar{q} \end{pmatrix} \in \{0\} \times \R^k \subset \R^n.\end{split}\]
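
The claim that the residual \(Q_1 b + q\) lies in \(\ker Q_1\) and has coordinates \((0, \bar{q})\) in the eigenbasis can be checked on a small example. This is a sketch with example data of our own choosing; the variable names are illustrative only.

```python
import numpy as np

# Example data (ours): Q1 is singular and Q1 b + q = 0 has no solution.
Q1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])
q = np.array([1.0, 0.0])
n = Q1.shape[0]

# Orthogonal diagonalization, reordered so the zero eigenvalues come last.
w, V = np.linalg.eigh(Q1)
is_zero = np.isclose(w, 0.0)
order = np.argsort(is_zero, kind="stable")  # nonzero eigenvalues first
O1 = V[:, order]
k = int(is_zero.sum())

# b = Q1^+ (-q) via the Moore-Penrose pseudoinverse.
b = -np.linalg.pinv(Q1) @ q
residual = Q1 @ b + q

# Coordinates of the residual in the eigenbasis: (0, ..., 0, q_bar).
coords = O1.T @ residual
q_bar = coords[n - k:]
```

Here the residual is \((\tfrac12, -\tfrac12)\), which spans \(\ker Q_1\). The sign of q_bar depends on the sign of the eigenvector returned by the solver, which is not fixed.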

We now choose a basis \((v_1,\dots,v_{k-1},\bar{q})\) of \(\R^k\), write this basis as the rows of a matrix \(R\) and define

\[\begin{split}B_1 \coloneqq \begin{pmatrix} I_{n-k} & 0 \\ 0 & R^{-1} \end{pmatrix}.\end{split}\]

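Since \(\bar{q}\) is the last row of \(R\), we have \(R^\transp e_k = \bar{q}\) and hence \(B_1{}^\transp O_1{}^\transp (Q_1 b + q) = e_n\). Moreover, \(B_1\) acts only on the kernel block and therefore leaves the diagonal matrix \(O_1{}^\transp Q_1 O_1\) unchanged. Thus the affine transformation \(u \mapsto O_1 B_1 u + b\) diagonalizes \(Q_1\) and transforms \(q\) to \(e_n\).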
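
One simple way to complete \(\bar{q}\) to a basis is to take the standard basis of \(\R^k\) and replace one vector \(e_j\) with \(\bar{q}_j \ne 0\) by \(\bar{q}\). The sketch below does this in NumPy; the helper name completion_matrix and the example values are our own assumptions, and any other basis completion would work equally well.

```python
import numpy as np

def completion_matrix(q_bar):
    """Hypothetical helper: rows form a basis of R^k whose last row is
    q_bar. Requires q_bar to be nonzero."""
    k = q_bar.shape[0]
    # Drop the standard basis vector at the largest entry of q_bar,
    # so the remaining rows together with q_bar stay independent.
    j = int(np.argmax(np.abs(q_bar)))
    rows = [np.eye(k)[i] for i in range(k) if i != j]
    rows.append(q_bar)
    return np.array(rows)

# Example values (ours): n = 4, kernel dimension k = 2.
n, k = 4, 2
q_bar = np.array([3.0, -2.0])

R = completion_matrix(q_bar)
B1 = np.eye(n)
B1[n - k:, n - k:] = np.linalg.inv(R)

# B1^T maps (0, q_bar) to the last standard basis vector e_n.
image = B1.T @ np.concatenate([np.zeros(n - k), q_bar])
```

Choosing the largest entry of \(\bar{q}\) as the replaced position keeps \(R\) well away from singular, which matters numerically when \(R\) is inverted.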

Step 2: Eliminating \(r\)

We want \(\langle b, Q_1 b + 2q \rangle + r = 0\). After the last step, we have a matrix of the form

\[\begin{split}\begin{pmatrix} D & & \\ & 0 & 1 \\ & 1 & \tilde{r} \\ \end{pmatrix}.\end{split}\]

It is now easy to see that the affine transformation \(u \mapsto u - \frac{\tilde r}{2}e_n\) eliminates \(\tilde{r}\).
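
To spell this out, apply the block formula for \(A^\transp Q A\) from the preliminaries with \(A_1 = I_n\) and \(b = -\frac{\tilde r}{2} e_n\) to the matrix above, whose blocks we denote here by \(\tilde{Q}_1 = \operatorname{diag}(D, 0)\), \(\tilde{q} = e_n\) and \(\tilde{r}\) (the tilde notation for the first two blocks is ours). Since the \(n\)th diagonal entry of \(\tilde{Q}_1\) vanishes, we have \(\tilde{Q}_1 b = 0\), and therefore

\[\begin{split}\begin{aligned} \tilde{Q}_1 b + \tilde{q} &= e_n, \\ \langle b, \tilde{Q}_1 b + 2\tilde{q} \rangle + \tilde{r} &= -\frac{\tilde r}{2} \cdot 2 \langle e_n, e_n \rangle + \tilde{r} = 0, \end{aligned}\end{split}\]

so the off-diagonal block stays \(e_n\) and the lower right entry becomes 0.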

We can now, just as in the non-parabolic case, rescale the nonzero diagonal entries of \(D\) to 1 or -1 (leaving the last two coordinates unchanged, so that the off-diagonal 1s are preserved) and we are done.