Eigenvalues and Eigenvectors


Properties of Eigenvalues

Gradient of eigenvalues

Sum and product of eigenvalues

The trace of a matrix is the sum of its eigenvalues!

$$\operatorname{tr} A = \sum_i \lambda_i$$

The determinant of a matrix is the product of its eigenvalues!

$$\det A = \prod_i \lambda_i$$
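A quick numerical sanity check of both identities, sketched with NumPy (the random $4\times 4$ matrix and seed are just illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))     # a generic (not necessarily symmetric) matrix
eigvals = np.linalg.eigvals(A)      # eigenvalues, possibly complex

# Trace = sum of eigenvalues; determinant = product of eigenvalues.
print(np.trace(A), np.sum(eigvals).real)
print(np.linalg.det(A), np.prod(eigvals).real)
```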

Other Properties (cheat sheet)

  1. The determinant of $A$ is the product of its eigenvalues.
  1. An eigenvector is a nonzero $x$ with $Ax = \lambda x$, which means that $\det(\lambda I - A) = 0$. We can use this to create a polynomial in $\lambda$ whose roots are the eigenvalues.
  1. The trace of $A$ is the sum of its eigenvalues.
  1. The rank of $A$ is equal to the number of non-zero eigenvalues of $A$ (for a diagonalizable matrix; a nonzero nilpotent matrix has positive rank but only zero eigenvalues).
  1. $A^{-1}$ has the same eigenvectors as the original matrix $A$, but its eigenvalues are the multiplicative inverses of those of $A$.
  1. $A, A^T$ have the same eigenvalues. This goes back to $\det(A) = \det(A^T)$, applied to $A - \lambda I$.
  1. Similar matrices have the same eigenvalues but not necessarily the same eigenspaces.
    1. Proof: $A = XBX^{-1}$. This means that $\det(A - \lambda I) = \det(XBX^{-1} - \lambda I) = \det(X(B - \lambda I)X^{-1}) = \det(X)\det(B - \lambda I)\det(X^{-1}) = \det(B - \lambda I)$. Therefore, the characteristic polynomials are the same, and therefore the eigenvalues are the same.
  1. Cayley-Hamilton Theorem: $\pi_A(A) = 0$, i.e. when you plug the matrix into its characteristic polynomial, it always yields the zero matrix. Interesting! (The theorem holds for any square matrix; the proof below assumes $A$ is diagonalizable. See the numerical sketch after this list.)
    1. Proof: $A = XDX^{-1}$. Therefore $\pi_A(A) = X\pi_A(D)X^{-1}$. Now, $\pi_A(D) = 0$ because each diagonal entry of $D$ is a root of the characteristic polynomial $\pi_A$. Neat!!
  1. for a symmetric (more generally, normal) matrix, eigenvectors that correspond to distinct eigenvalues are orthogonal (spectral theorem, more on this later)
  1. right eigenvectors (and, separately, left eigenvectors) corresponding to distinct eigenvalues form linearly independent sets
  1. eigenvectors corresponding to distinct eigenvalues are linearly independent
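Several of the items above are easy to spot-check numerically. Below is a minimal NumPy sketch (the particular matrices and seed are just illustrative): it compares the characteristic polynomials of $A$, $A^T$, and a similar matrix, then evaluates the characteristic polynomial at $A$ to check Cayley-Hamilton.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
X = rng.standard_normal((3, 3))           # assumed invertible (true for a generic random matrix)
B = X @ A @ np.linalg.inv(X)              # B is similar to A

# Similar matrices, and A vs A^T, share the same characteristic polynomial,
# hence the same eigenvalues.
print(np.allclose(np.poly(A), np.poly(A.T)))
print(np.allclose(np.poly(A), np.poly(B)))

# Cayley-Hamilton: evaluating the characteristic polynomial at A gives the zero matrix.
coeffs = np.poly(A)                       # coefficients, highest degree first
n = len(coeffs) - 1
pi_of_A = sum(c * np.linalg.matrix_power(A, n - k) for k, c in enumerate(coeffs))
print(np.allclose(pi_of_A, np.zeros_like(A)))
```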

Creating eigenvalues

Invariant subspaces

An invariant subspace $U$ of $V$ is defined as follows: for all $u \in U$, we have that $Tu \in U$. Some examples of invariant subspaces include the null space and the range of $T$, but also some other special subspaces known as eigenspaces.

Eigenvalues are defined by any of the following four equivalent conditions:

  1. $T - \lambda I$ is not injective
  1. $T - \lambda I$ is not surjective
  1. $T - \lambda I$ is not invertible
  1. $\det(A - \lambda I) = 0$, where $A$ is the matrix of $T$ in some basis

It is from this last condition that we arrive at the definition of the characteristic polynomial.

Characteristic polynomial

$\pi_A(\lambda) = \det(A - \lambda I)$ is the characteristic polynomial of $A$. The roots of the polynomial are the eigenvalues.

The eigenspace $V_\lambda$ is just $N(A - \lambda I_n)$, the null space of $A - \lambda I_n$; it consists of all the vectors with eigenvalue $\lambda$ (together with the zero vector).
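As a concrete illustration of these definitions, here is a short NumPy sketch (the $2\times 2$ matrix and the $10^{-10}$ threshold are just illustrative choices): it recovers the eigenvalues as roots of the characteristic polynomial and an eigenspace as the null space $N(A - \lambda I)$ via the SVD.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Characteristic polynomial coefficients; np.poly uses det(tI - A),
# which has the same roots as det(A - tI).
coeffs = np.poly(A)
print(np.roots(coeffs))                   # eigenvalues 3 and 1

# The eigenspace for lambda = 3 is N(A - 3I); compute it via the SVD.
lam = 3.0
_, s, Vt = np.linalg.svd(A - lam * np.eye(2))
eigenspace_basis = Vt[s < 1e-10]          # right singular vectors with ~zero singular value
print(eigenspace_basis)                   # spans the line through (1, 1)/sqrt(2)
```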

Multiplicity

The algebraic multiplicity of an eigenvalue is defined to be the dimension of the corresponding generalized eigenspace $G(\lambda, T)$. The geometric multiplicity is concerned only with $E(\lambda, T)$, the ordinary eigenspace.

The sum of the algebraic multiplicities is equal to $\dim V$ (over an algebraically closed field such as $\mathbb{C}$).

Mathematically, if $\pi_A(\lambda) = (\lambda - \lambda_1)^m f(\lambda)$ with $f(\lambda_1) \neq 0$, then $\lambda_1$ has algebraic multiplicity $m$.

It is possible for the geometric multiplicity to be strictly less than the algebraic multiplicity; this is called being defective. In other words, there are some vectors that don't become eigenvectors until $T - \lambda I$ is raised to a higher power. A defective matrix is not diagonalizable, while a non-defective matrix is diagonalizable, meaning that it has an eigenbasis.
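A classic example of a defective matrix is the $2\times 2$ Jordan block below (a NumPy sketch; the specific matrix is just an illustration): the eigenvalue $1$ has algebraic multiplicity $2$ but geometric multiplicity $1$, and every vector becomes a generalized eigenvector once $(T - \lambda I)$ is squared.

```python
import numpy as np

J = np.array([[1.0, 1.0],
              [0.0, 1.0]])              # Jordan block: defective

print(np.linalg.eigvals(J))             # [1., 1.] -> algebraic multiplicity 2

# Geometric multiplicity = dim N(J - I) = 2 - rank(J - I) = 1.
print(2 - np.linalg.matrix_rank(J - np.eye(2)))

# (J - I)^2 = 0, so every vector is a generalized eigenvector even though
# only multiples of e1 are ordinary eigenvectors.
print((J - np.eye(2)) @ (J - np.eye(2)))
```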

Eigenspace

An eigenspace $E(\lambda, T)$ is just the span of all eigenvectors with eigenvalue $\lambda$. Alternatively, $E(\lambda, T) = \operatorname{null}(T - \lambda I)$.

The sum of eigenspaces is a direct sum, because eigenvectors with different eigenvalues are linearly independent.

Basis of eigenvectors

Sometimes you can have a basis of eigenvectors, which means that you have $u_1, \dots, u_n$ such that $Tu_i = \lambda_i u_i$. The matrix of $T$ with respect to this basis is just a diagonal matrix whose entries are the eigenvalues.

If such a basis exists, then you can find a mapping to this basis $X$ and represent the map as $X^{-1}DX$ or $XDX^{-1}$, depending on how you prefer to write it.

We will see some results involving this idea below

Another way of thinking about this

A square matrix has an eigenbasis if you can write $A = VDV^{-1}$ (or the other way around). As such, you can think of an eigenbasis as being able to stretch and shrink space in desired directions.
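A quick numerical sketch of this factorization (assuming, as above, that $A$ is non-defective; the particular matrix is arbitrary): compute the eigenvectors, stack them as the columns of $V$, and reconstruct $A = VDV^{-1}$.

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])              # distinct eigenvalues 5 and 2, so an eigenbasis exists

eigvals, V = np.linalg.eig(A)           # columns of V are the eigenvectors
D = np.diag(eigvals)

print(np.allclose(A, V @ D @ np.linalg.inv(V)))   # True: A = V D V^{-1}
```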

Diagonalization

We will build up the intuition for why $A = UDU^T$ for a symmetric matrix. First, let's consider $AU$, where $U$ is the matrix whose columns are the orthonormal eigenbasis $u_1, \dots, u_n$. Each column of $AU$ is $Au_i = \lambda_i u_i$, so $AU = UD$, where $D$ is the diagonal matrix of eigenvalues.

Now, note that $UU^T = I$ because $U$ is orthogonal, so multiplying $AU = UD$ on the right by $U^T$ gives $A = UDU^T$.

The importance of diagonalization

You can think of $U$ as a map that takes vectors expressed in the eigenbasis and pushes them over to the canonical basis. As such, $U^T$ takes vectors in the canonical basis and moves them to the eigenbasis. This is why it's $UDU^T$ and not the other way around.

Matrix-vector multiplication and matrix exponentials become very easy when dealing with an eigenbasis representation
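For instance, powers and exponentials only touch the diagonal factor: $A^k = VD^kV^{-1}$ and $e^A = Ve^DV^{-1}$. A minimal NumPy sketch, assuming $A$ is diagonalizable (the matrix is just an illustration):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
eigvals, V = np.linalg.eig(A)
V_inv = np.linalg.inv(V)

# A^3 via the eigenbasis: only the eigenvalues get cubed.
A_cubed = V @ np.diag(eigvals ** 3) @ V_inv
print(np.allclose(A_cubed, np.linalg.matrix_power(A, 3)))

# Matrix exponential via the eigenbasis: exponentiate only the eigenvalues.
exp_A = V @ np.diag(np.exp(eigvals)) @ V_inv
print(exp_A)
```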

Other Formalities

Right and left eigenvector

A right eigenvector is such that $Ax = \lambda x$.

A left eigenvector is such that $y^TA = \lambda y^T$.

We will mostly care about right eigenvectors.

Duality

The transpose of a right eigenvector of $A$ is a left eigenvector of $A^T$, and the transpose of a right eigenvector of $A^T$ is a left eigenvector of $A$. You can prove this by transposing the equation $Ax = \lambda x$.
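A small NumPy check of this duality (the random matrix and seed are illustrative): take a right eigenvector $y$ of $A^T$ and verify that $y^T$ behaves as a left eigenvector of $A$.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))

# y is a right eigenvector of A^T with eigenvalue mu ...
mu, Y = np.linalg.eig(A.T)
y = Y[:, 0]

# ... so y^T should act as a left eigenvector of A: y^T A = mu y^T.
print(np.allclose(y @ A, mu[0] * y))
```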

Properties

Quadratic form decomposition

If you have $x^TAx$ for a symmetric $A$, you can split it up into

$$x^TU\Sigma U^T x = \sum_i \lambda_i\langle x, v_i\rangle^2$$

Why? Well, $U^Tx$ computes the projections onto the eigenvectors, which is equivalent to taking inner products since the $v_i$ are unit vectors; $\Sigma$ scales each projection by $\lambda_i$; and the associative property means that we can group the product as $(x^TU)\Sigma (U^Tx)$.
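A numerical check of this decomposition (a sketch; the random symmetric matrix and seed are illustrative), using `numpy.linalg.eigh`, which returns orthonormal eigenvectors for a symmetric matrix:

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                      # symmetric, so an orthonormal eigenbasis exists
x = rng.standard_normal(4)

lam, U = np.linalg.eigh(A)             # columns of U are orthonormal eigenvectors v_i

lhs = x @ A @ x                        # the quadratic form x^T A x
rhs = np.sum(lam * (U.T @ x) ** 2)     # sum_i lambda_i <x, v_i>^2
print(np.allclose(lhs, rhs))           # True
```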

Davis Kahan Theorem

If $A, B$ are $d\times d$ symmetric matrices and $\|A - B\|_F \leq \epsilon$, then

$$|\langle v_i(A), v_i(B)\rangle| \geq 1 - \frac{2\epsilon}{\min_{j\neq i}|\lambda_i(A) - \lambda_j(A)|}$$

where $\lambda_i(A)$ denotes the $i$-th eigenvalue of $A$, and $v_i(A)$ the corresponding unit eigenvector of $A$.
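A small numerical illustration of the theorem, not a proof (the diagonal matrix, perturbation size, and seed are all illustrative assumptions): perturb a symmetric matrix whose top eigenvalue is well separated, and compare the alignment of the top eigenvectors against the bound stated above.

```python
import numpy as np

rng = np.random.default_rng(4)
A = np.diag([10.0, 5.0, 1.0])                    # symmetric, well-separated eigenvalues
E = rng.standard_normal((3, 3))
B = A + 0.05 * (E + E.T) / 2                     # small symmetric perturbation

eps = np.linalg.norm(A - B, 'fro')
lam_A, V_A = np.linalg.eigh(A)                   # eigenvalues in ascending order
lam_B, V_B = np.linalg.eigh(B)

i = 2                                            # index of the largest eigenvalue
gap = np.min(np.abs(lam_A[i] - np.delete(lam_A, i)))
alignment = np.abs(V_A[:, i] @ V_B[:, i])        # |<v_i(A), v_i(B)>|

print(alignment, ">=", 1 - 2 * eps / gap)        # alignment should exceed the bound
```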