Here is a salvage: there is a continuous function $f(A)$ of a matrix $A$, even a polynomial function, which checks whether the eigenvalues of $A$ are distinct (over an algebraic closure). Any matrix with distinct eigenvalues is diagonalizable, although not conversely, and having distinct eigenvalues is frequently a useful substitute for diagonalizability in proofs, since it implies diagonalizability and is often easier to prove things about.

Namely, we can take $f(A)$ to be the discriminant $\Delta$ of the characteristic polynomial of $A$. This is a homogeneous polynomial of degree $n(n-1)$ in the entries of $A$ which is equal to $\prod_{i < j} (\lambda_i - \lambda_j)^2$, where the $\lambda_i$ are the eigenvalues of $A$, and so it is nonzero iff $A$ has distinct eigenvalues. For example, when $n = 2$, if we write the entries of $A$ as

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

then the characteristic polynomial is $\chi_A(t) = t^2 - (a + d) t + (ad - bc)$, and its discriminant is

$$\Delta = (a + d)^2 - 4(ad - bc) = \boxed{ (a - d)^2 + 4bc }.$$

You can check that this vanishes on all Jordan blocks, and more generally that it is zero iff the characteristic polynomial has a repeated root, namely $\frac{a+d}{2}$.
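For a quick sanity check, here is a small NumPy sketch (not part of the original argument; the helper name `disc2` is just for illustration) that evaluates $\Delta$ for a Jordan block and for a matrix with distinct eigenvalues, and compares it with $(\lambda_1 - \lambda_2)^2$ computed from the eigenvalues:

```python
import numpy as np

def disc2(A):
    """Discriminant of the characteristic polynomial of a 2x2 matrix:
    (trace)^2 - 4*det = (a - d)^2 + 4*b*c."""
    return np.trace(A)**2 - 4 * np.linalg.det(A)

# A Jordan block: repeated eigenvalue, not diagonalizable -> discriminant 0
J = np.array([[2.0, 1.0],
              [0.0, 2.0]])

# A matrix with distinct eigenvalues -> discriminant nonzero
B = np.array([[1.0, 2.0],
              [3.0, 4.0]])

for M in (J, B):
    lam = np.linalg.eigvals(M)
    # the two printed numbers agree up to rounding: 0 for J, 33 for B
    print(disc2(M), (lam[0] - lam[1])**2)
```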

As David explains, the problems all stem from diagonalizable matrices with repeated eigenvalues, which are difficult to distinguish from non-diagonalizable matrices (so difficult that no continuous function can distinguish them). If we had to work numerically to finite precision then it would actually be impossible to distinguish these, and it would even be impossible to numerically distinguish non-diagonalizable matrices from diagonalizable matrices with distinct eigenvalues, some of which are very close together.
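To make the finite-precision point concrete, here is a small NumPy experiment (my own illustration, not from the argument above): a Jordan block and a nearby diagonalizable matrix differ by about $10^{-13}$, while a perturbation of size only $10^{-12}$ already moves the eigenvalues by about $10^{-6}$, swamping that gap.

```python
import numpy as np

J = np.array([[1.0, 1.0],
              [0.0, 1.0]])            # non-diagonalizable Jordan block
D = np.array([[1.0, 1.0],
              [0.0, 1.0 + 1e-13]])    # diagonalizable, eigenvalues 1 and 1 + 1e-13

print(np.linalg.norm(J - D))          # ~1e-13: the two matrices are nearly identical

# A perturbation far smaller than typical rounding/data error moves the
# eigenvalues by roughly its square root, which dwarfs the 1e-13 gap above.
E = np.array([[0.0,   0.0],
              [1e-12, 0.0]])
print(np.linalg.eigvals(J + E))       # approximately 1 +/- 1e-6
```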

Edit: Here is some more discussion. We can consider Hagen von Eitzen's example of the matrices

$$X(\varepsilon) = \begin{bmatrix} 1 & 1 \\ 0 & 1 + \varepsilon \end{bmatrix}$$

which have distinct eigenvalues for $\varepsilon \neq 0$ but suddenly become non-diagonalizable at $\varepsilon = 0$. This is interesting because if a matrix has distinct eigenvalues then it has a basis of eigenvectors which is unique up to scaling (note that this is not true if it is diagonalizable but with repeated eigenvalues), but a non-diagonalizable matrix doesn't have a basis of eigenvectors. So, what happens to the eigenvectors of $X(\varepsilon)$ as $\varepsilon \to 0$?

One of them doesn't change at all, namely $v_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$, and this one remains an eigenvector as $\varepsilon \to 0$. So all of the interesting action must be happening with the other eigenvector, associated to the eigenvalue $1 + \varepsilon$. This eigenvector can be taken to be

$$v_{1 + \varepsilon} = \begin{bmatrix} 1 \\ \varepsilon \end{bmatrix}.$$

So now we can see what happens: as $\varepsilon \to 0$, not only do the eigenvalues collide, but the eigenvectors collide as well! We can also see that if we had to work numerically to finite precision and $\varepsilon$ were very small, it would be very difficult to tell these eigenvectors apart numerically.
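Here is a short NumPy illustration of this collision (again just an illustrative sketch): as $\varepsilon$ shrinks, the two computed unit eigenvectors of $X(\varepsilon)$ both approach $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$.

```python
import numpy as np

def X(eps):
    return np.array([[1.0, 1.0],
                     [0.0, 1.0 + eps]])

for eps in (1e-1, 1e-4, 1e-8):
    vals, vecs = np.linalg.eig(X(eps))
    # columns of vecs are the unit eigenvectors; as eps -> 0 both columns
    # approach [1, 0]^T (up to sign), i.e. the eigenvectors collide
    print(eps, vals, vecs[:, 0], vecs[:, 1])
```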

This suggests a general picture: we can think of matrices with some eigenvalues very close to each other as having eigenvectors which are also very close to each other, and which collide when the eigenvalues collide. Matrices like this are "barely diagonalizable"; it's hard to distinguish these very close eigenvectors, and it gets harder the closer the eigenvalues are.
