add some discussion of formalizing symmetry breaking

bsubercaseaux · Jun 9, 2024 · 38e267d · 38e267d
1 parent 0653dab
commit 38e267d
Showing 1 changed file with 56 additions and 14 deletions.
diff --git a/ITP/symmetry-breaking.tex b/ITP/symmetry-breaking.tex
@@ -58,22 +58,41 @@
 
 
 \begin{proof}[Proof Sketch]
-The proof proceeds in 6 steps, illustrated in~\Cref{fig:symmetry-breaking}. In each of the steps, we will construct a new list of points that is $\sigma$-equivalent to the previous one, and the last one will be in canonical position.\footnote{Even though we defined $\sigma$-equivalence for sets of points, our formalization goes back and forth between sets and lists. Given that symmetry breaking distinguishes between the order of the points e.g., $x$-order, this proof proceeds over lists. All permutations of a list are immediately $\sigma$-equivalent.}
-The main justification for each step is that, given that the function $\sigma$ is defined as a sign of the determinant, applying transformations that preserve (or, when \lstinline|parity := true|, uniformly reverse) the sign of the determinant will preserve (or uniformly reverse) the values of $\sigma$. In particular, given the identity $\det(AB) = \det(A)\det(B)$, if we apply a transformation to the points that corresponds to multiplying by a matrix $B$ such that $\det(B) > 0$, then $\sign(\det(A)) = \sign(\det(AB))$, and thus orientations will be preserved.
-In step 1, we transform the list of points so that no two points share the same $x$-coordinate. This can be done by applying a rotation to the list of points, which corresponds to multiplying by a rotation matrix.
-Rotations always have determinant $1$. 
-In step 2, we translate all points by a constant vector $t$, by multiplying by a translation matrix, so that the left most point gets position $(0, 0)$, and naturally every other point will have a positive $x$-coordinate.
-Let $L_2$ be the list of points after this transformation, excluding $(0,0)$ which we will denote by $p_1$.
-Then, in step 3, we  apply the projective transformation $f: (x, y) \mapsto (y/x, 1/x)$ to every point in $L_2$, showing that this preserves orientations within $L_2$.
-To see that this mapping is a $\sigma$-equivalence consider that 
+The proof proceeds in 6 steps, illustrated in~\Cref{fig:symmetry-breaking}.
+In each of the steps, we will construct a new list of points that is $\sigma$-equivalent to the previous one,
+and the last one will be in canonical position.
+\footnote{Even though we defined $\sigma$-equivalence for sets of points,
+our formalization goes back and forth between sets and lists.
+Given that symmetry breaking distinguishes between the order of the points
+e.g., $x$-order, this proof proceeds over lists.
+All permutations of a list are immediately $\sigma$-equivalent.}
+The main justification for each step is that,
+given that the function $\sigma$ is defined as a sign of the determinant,
+applying transformations that preserve (or, when \lstinline|parity := true|, uniformly reverse)
+the sign of the determinant will preserve (or uniformly reverse) the values of $\sigma$.
+
+For example, given the identity $\det(AB) = \det(A)\det(B)$,
+if we apply a transformation to the points that corresponds to multiplying by a matrix $B$ such that $\det(B) > 0$,
+then $\sign(\det(A)) = \sign(\det(AB))$, and thus orientations will be preserved.
+\textbf{Step 1}: we transform the list of points so that no two points share the same $x$-coordinate.
+This can be done by applying a rotation to the list of points, which corresponds to multiplying by a rotation matrix.
+Rotations always have determinant $1$.
+\textbf{Step 2}: we translate all points by a constant vector $t$, by multiplying by a translation matrix,
+so that the left most point gets position $(0, 0)$, and naturally every other point will have a positive $x$-coordinate.
+
+Let $L_2$ be the list of points after Step 2, excluding $(0,0)$ which we will denote by $p_1$.
+\textbf{Step 3}: we apply the projective transformation $f: (x, y) \mapsto (y/x, 1/x)$ to every point in $L_2$,
+showing that this preserves orientations within $L_2$.
+To see that this mapping is a $\sigma$-equivalence consider that
 \[
 \begin{multlined}
  \sign \det \begin{pmatrix} p_x & q_x & r_x \\ p_y & q_y & r_y \\ 1 & 1 & 1 \end{pmatrix} =  \sign \det \left( \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0\\ 0 & 1 & 0 \end{pmatrix}  \begin{pmatrix} \nicefrac{p_y}{p_x} & \nicefrac{q_y}{q_x} & \nicefrac{r_y}{r_x} \\ \nicefrac{1}{p_x} & \nicefrac{1}{q_x} & \nicefrac{1}{r_x} \\ 1 & 1 & 1 \end{pmatrix}  \begin{pmatrix} p_x & 0 & 0 \\ 0 & q_x & 0\\ 0 & 0 & r_x \end{pmatrix} \right)\\
                         = \sign \left(1 \cdot \det  \begin{pmatrix} \nicefrac{p_y}{p_x} & \nicefrac{q_y}{q_x} & \nicefrac{r_y}{r_x} \\ \nicefrac{1}{p_x} & \nicefrac{1}{q_x} & \nicefrac{1}{r_x} \\ 1 & 1 & 1 \end{pmatrix} \cdot  p_x q_x r_x  \right) = \sign \det \begin{pmatrix} p_y/p_x & q_y/q_x & r_y/r_x \\ 1/p_x & 1/q_x & 1/r_x \\ 1 & 1 & 1 \end{pmatrix}.
                         %  \tag{As $p_x q_x r_x > 0$ by step 2}
 \end{multlined}
 \]
-To preserve orientations with respect to the leftmost point $(0, 0)$, we set $f( (0, 0)) = (0, \infty)$, a special point that is treated separately as follows.
+To preserve orientations with respect to the leftmost point $(0, 0)$, we set $f( (0, 0)) = (0, \infty)$,
+a special point that is treated separately as follows.
 As the function $\sigma$ takes points in $\mathbb{R}^2$ as arguments,
 we need to define an extension
 \(
@@ -83,13 +102,25 @@
   \end{cases},
 \)
 We then show that $\sigma((0, 0), q, r) = \sigma_{(0, \infty)}(f(q), f(r))$ for all points $q, r \in L_2$. 
-In step 4, we sort the list $L_2$ by $x$-coordinate in increasing order, thus obtaining a list $L_3$.
-This can be done while preserving $\sigma$-equivalence because sorting corresponds to a permutation, and all permutations of a list are $\sigma$-equivalent by definition.
-In step 5, we check whether the \textsf{Lex order} condition above is satisfied in $L_3$, and if it is not, we reflect the pointset, which preserves $\sigma$-equivalence with \lstinline|parity := true|.
+
+\textbf{Step 4}: we sort the list $L_2$ by $x$-coordinate in increasing order,
+thus obtaining a list $L_3$.
+This can be done while preserving $\sigma$-equivalence because sorting corresponds to a permutation,
+and all permutations of a list are $\sigma$-equivalent by definition.
+\textbf{Step 5}: we check whether the \textsf{Lex order} condition above is satisfied in $L_3$,
+and if it is not, we reflect the pointset,
+which preserves $\sigma$-equivalence with \lstinline|parity := true|.
 Note that in such a case we need to relabel the points from left to right again.
-In step 6, we bring point $(0, \infty)$ back into the range by first finding a constant $c$ such that all points in $L_3$ are to the right of the line $y=c$, and then finding a large enough value $M$ such that $(c, M)$ has the same orientation with respect to the other points as $(0, \infty)$ did, meaning that 
+
+\textbf{Step 6}: we bring point $(0, \infty)$ back into the range
+by first finding a constant $c$ such that all points in $L_3$ are to the right of the line $y=c$,
+and then finding a large enough value $M$ such that $(c, M)$ has the same orientation
+with respect to the other points as $(0, \infty)$ did,
+meaning that 
 \(\sigma((c, M), q, r) = \sigma_{(0, \infty)}(q, r)\) for every $q, r \in L_3$.
-Finally, we note that the list of points obtained in step 6 satisfies the \text{CCW-order} property by the following reasoning:
+
+Finally, we note that the list of points obtained in step 6
+satisfies the \text{CCW-order} property by the following reasoning:
 if $1 < i < j \leq n$ are indices, then 
 \begin{align*}
   \sigma(p_1, p_i, p_j) = 1 &\iff \sigma((c, M), p_i, p_j) = 1\\
@@ -100,3 +131,14 @@
 This concludes the proof.
 \end{proof}
 
+We had quite some difficulty formalizing the symmetry-breaking argument.
+We spent many hours around a whiteboard,
+simplifying the argument to minimize the amount of theory we would need to develop.
+The projective transformation in Step 3 was particularly difficult to wrangle.
+This step, and those after it, requires careful bookkeeping around the distinguished point $p_1$.
+
+Once these details were worked out,
+the proof was relatively straightforward, albeit long and tedious.
+Each step of the proof justifies not only that a new property is achieved,
+but also that all properties from previous steps are preserved.
+As a result, the proof burden was substantial.